digital surface model extraction and refinement through

PFG 2012 / 4, 0317–0330 ArticleStuttgart, August 2012

© 2012 E. Schweizerbar t'sche Verlagsbuchhandlung, Stuttgar t, Ger many www.schweizerbar t.deDOI: 10.1127/1432-8364/2012/0120 1432-8364/12/0120 $ 3.25

Digital Surface Model Extraction and Refinementthrough Image Segmentation – Application to theISPRS Benchmark Stereo Dataset

Daniela Poli & Pierre Soille, Ispra (VA), Italy

Keywords: high resolution, data fusion, digital surface models, segmentation

VHR satellite images are generally used forthis scope, as they almost cover any area of theplanet, in particular remotely located areas. Todescribe the pre-event and post-event scenar-ios the available data are analysed and even-tually fused in order to achieve the most ac-curate and reliable information. With respectto automatic 3D information extraction, theavailability of accurate and detailed DSMs is acr ucial issue for automatic building detectionand subsequent damage estimation. Whilethe acquisition of VHR stereo scenes can beplanned after the event and used to model thepost-event 3D scenario, VHR stereo scenesare generally not available for data acquisi-tion before the event; on the other hand, single

1 Introduction

In the last years the number of Earth-observa-tion platforms equipped with sensors deliver-ing ver y high resolution (VHR) images withground sample distances (GSD) smaller than1 m has increased. Those sensors are charac-terized by different geometric, radiometric,spectral and operational specifications, and inmost cases allow for the acquisition of mul-ti-angle imager y for automatic digital surfacemodels (DSM) generation and 3D feature ex-traction. One of the tasks of the Joint ResearchCentre of the European Commission in Ispra(Italy) is to assess changes in land cover af-ter events like natural disasters or conflicts.

Summary: This paper proposes a methodology forthe geometric refinement of a digital surface model(DSM) using high or ver y high resolution satellitescenes through an advanced hierarchical imagesegmentation and shows the results obtained on adataset of Catalonia, Spain, provided by ISPRS WGI/4 through the project “Benchmarking and qualityanalysis of DEM generated from high and ver yhigh resolution optical stereo satellite data”. Duringthis experiment a WorldView-1 quasi-nadir scenewas used for the enhancement of the DSM gener-ated by a Car tosat-1 stereopair. The original andfinal DSMs are compared to the lidar DSM on thesame area for quality analysis.

Zusammenfassung: Erstellung von Ober flächen-modellen und deren Verbesser ung durch hierarchi-sche Bildsegmentier ung am Beispiel der ISPRSBenchmark-Stereodaten. Dieser Ar tikel stellt eineMethode zur geometrischen Verbesser ung einesdigitalen Ober flächenmodells (DSM) durch hierar-chische Bildsegmentier ung von hoch bzw. sehrhoch auflösenden Satellitenbilder n vor. Die Metho-de wurde an einem Datensatz aus Katalonien inSpanien getestet, der im Rahmen des ISPRS WGI/4 Projektes „Benchmarking and quality analysisof DEM generated from high and ver y high resolu-tion optical stereo satellite data“ zur Verfügunggestellt wurde. In diesem Experiment wurde einaus einem Car tosat-1-Stereopaar erstelltes DSMdurch eine WorldView-1 Quasi-Nadir-Szene ver-besser t. Zur Bewer tung der Methode wurde einLidar DSM als Referenz ver wendet. Der Ar tikelbeschreibt die Methode sowie die Ergebnisse desTests und der Qualitätsanalyse.

318 Photogrammetrie • Fernerkundung • Geoinformation 4/2012

DSM (klaUs et al. 2006, kRaUss & ReinaRtz

2010). For example, in the approach proposedby kRaUss & ReinaRtz (2010), one stereo im-age is segmented and transfer red to the dis-parity map, then for each segment the originaldisparity map is filled with suitable inter pola-tion of the disparities. In the proposed method,segmentation of a VHR scene is used to refinea given DSM at a coarser resolution.

2.2 Proposed Method

The principal steps of the proposed approachare summarized in Fig. 1. The initial DSM canbe any surface model generated by matchingtechniques from stereo images or other sourc-es. Unless already orthorectified, the VHRscene is oriented using ground control infor-mation and orthorectified on the initial DSM.The generated orthophoto is segmented withalpha-omega connectivity (soille 2008) andthe segments are overlaid to the existing DSM.After calculating the statistics (mean, median,sigma, minimum and maximum values) of theheights of the points falling into each segment,the new surface model is calculated from theexisting one by imposing that the height val-ues that fall into the same segment follow acertain mathematic function, i.e. constant val-ue or planar surface. In the initial version ofthe algorithm, the new height values for eachsegment were calculated as the minimum,mean or maximum values of the original DSMheights, or as the result of planar fitting.

A more advanced approach for the calcula-tion of the enhanced height values was thendeveloped. It is based on exploiting not onlythe radiometric information of the VHR scene,but also the information contained in the DSMshape. The aim is to enhance the surfacemodel using the maximum value in segmentsdeemed to cor respond to a building, and usingthe minimum value for segments supposed tomatch ground areas between buildings. Build-ings are generally represented in the DSM asconvex surfaces while ground areas betweenbuildings are likely to match concave surfaces(the term convex and concave are used in thesense “convex from above” and “concave fromabove”, respectively). Therefore, the originalDSM is segmented to extract concave, convex

VHR scenes are likely to be available in thearchives, as well as DSMs at medium or lowresolution from multi-line optical sensors withsimultaneous along-track stereo acquisition,like SPOT-5/HRS, Cartosat-1, ALOS-PRISM,or other sources. In this context we developeda strategy for data fusion, in order to enhancethe surface models using information fromsingle scenes. It is enhanced in the sense thatthe final DSM is shar per and therefore likelyto be more suitable for subsequent 3D shapedetection. We tested the method on the datasetprovided by ISPRS Working Group I/4 withinthe project “Benchmarking and quality analy-sis of DEM generated from high and ver y highresolution optical stereo satellite data” (Rei-naRtz et al. 2010). The approach and first re-sults were presented at the Workshop “High-Resolution Earth Imaging for Geospatial In-formation” held in Hannover (Germany) inJune 2011 (poli & soille 2011); in this workthe latest developments in the methodologyand the achieved results are reported and dis-cussed.

2 Methodology

2.1 Related Works

The combined use of digital imager y and sur-face models mainly finds applications in auto-matic object extraction and 3D reconstr uctionfrom aerial data or ver y high resolution satel-lite sensors (baltsaVias et al. 1995, JaYnes etal. 1996, papaRoDitis et al. 1998, lU et al. 2002,tao & YasUoka 2002, lU et al. 2006, li et al.2008, aReFi 2009). Few works aim at usingsatellite imager y for DSM enhancement. InMaiRe (2010), user-defined semantic contents(sea, lakes, buildings, roads, etc.) are extractedusing a supervised classification in high reso-lution satellite imager y; then in cor respond-ence of each class the surface is modelled byplane surfaces with geometric constraints giv-en by the topological properties of each classand neighbour regions. This approach wasapplied, for example, for the enhancementof SRTM-X using a 2.5 m ground resolutionSPOT-5 scene. In the computer vision commu-nity, several matching approaches employ im-age segmentation during the generation of the

D. Poli & P. Soille, Digital Surface Model Extraction 319

false on the union of any pair of adjacent seg-ments (HoRoWitz & paVliDis 1976). Recently,a segmentation technique producing a hier-archical partitioning of the image definitiondomain under connectivity constraints cor re-sponding to logical predicates was proposed(soille 2008). Given the input constraints, theresulting partition is uniquely defined, con-trar y to most segmentation techniques. Byincreasing the threshold levels associated toeach connectivity constraint, one obtains a se-ries of fine-to-coarse partitions with the pixelsat the lowest level and the whole image defi-nition domain at the highest level of the hier-archy.

Constrained connectivity relies on a dis-similarity measure computed between eachpair of adjacent pixels. The absolute differ-ence between the intensity values was usedfor all experiments hereafter (see soille 2011for other dissimilarity measures). Two pixelsare said to be alpha-connected if there existsa path linking these two pixels in such a waythat the dissimilarity between all pairs of ad-jacent pixels of the path does not exceed the

and flat regions, and then the new height val-ues are computed differently, depending onwhether the segment falls in a concave, con-vex or flat region. The decision r ule is:● Use minimum value if the segment falls in

a concave region;● Use maximum value if the segment falls in

a convex region;● Use median value or planar fitting if the

segment falls in a flat region.In practice, to exclude possible outliers, the

10 % (resp. 90 %) percentile is used instead ofthe minimum (resp. maximum) value. In thenext paragraphs the algorithms used for imageand DSM segmentation are described.

2.3 Image Segmentation byConstrained Connectivity

Given a logical predicate, the segmentation ofan image is defined as the partition of the im-age definition domain into non-overlappingregions called segments, such that the logicalpredicate retur ns tr ue on ever y segment but

Fig. 1: Proposed workflow for DSM enhancement.


increasing this value, a fine-to-coarse hierar-chy of connected components is obtained. Inpractice, it is desirable to obtain segmentationas coarse as possible but retaining all str uc-tures of interest in the image given the con-sidered application. In this study, since we areinterested in buildings and other man-madestr uctures, the parameters need to be selectedin such a way that these objects are simplifiedas much as possible, i.e. ideally are matchedby one connected component, but not mergedwith other objects.

To further favour linking within homogene-ous regions while preventing it through tran-sitions, an image pre-processed by the edgeshar pening technique described in (soille

2011) is applied before segmentation. The pre-processing consists in considering as seeds alllocal maxima of the input image and propagat-ing their values so as to cover the whole imagedefinition domain. The propagation is con-trolled by the original image values (the small-er the difference between a seed and a non-seeded adjacent point, the greater the speed).Alter natively, rather than pre-processing theimage, the dissimilarity measure proposed in(soille 2011) could be used to prevent linkingthrough transitions while favouring it withinhomogeneous regions.

value of alpha. The relation ‘to be alpha-con-nected’ is an equivalence relation (reflexive,symmetric, and transitive relation) and there-fore it leads to a unique partition of the imagedefinition domain into regions of maximal ex-tent called alpha-connected components.

The constrained connected component of agiven pixel is then defined as the largest al-pha-connected component of this pixel thatsatisfies a series of constraints. Any numberof constraints may be considered. Given a lo-cal dissimilarity defined as the absolute dif-ference, the most natural constraint is definedin terms of a threshold level on the range ofthe grey level values of the connected com-ponents. This constraint is called the omega-constraint and its associated threshold valueis the omega threshold. Denoting by α andω the values of the alpha and omega thresh-olds, respectively, the (α,ω)-connected com-ponent of a pixel is defined as the largest α’-connected component such that α’ is less thanor equal to α and the range of the α’-connectedcomponent is less than or equal to ω (soille

2008). The omega constraint prevents linkingthrough transitions, a well-known problem ofsingle-linkage clustering.

A typical choice is to consider the same val-ues for the alpha and omega thresholds. By

Fig. 2: Example of constrained connectivity partitioning of a 512 × 256 sample of a WorldView-1image resampled at 2.5 m, rescaled to byte data type, and contrast enhanced. The partitions aredisplayed for the alpha and omega thresholds values equal to 0, 32, 64, and 128 respectively (fromfine to coarse partitions).


str ucturing element. In experiments with sat-ellite images, the proposed method demon-strated good performance in terms of shapedescription and retained small but significantregions in images.

3 Experimental Results

In our experiment the algorithms described insection 2 were used for the enhancement of aDSM generated from a stereopair acquired byCartosat-1 (C1) sensor using a WorldView-1(WV1) scene. In the next paragraphs the dataand detailed processing will be presented anddiscussed.

3.1 Data Description

The Working Group 4 of Commission I on“Geometric and Radiometric Modelling ofOptical Spacebor ne Sensors” is providing onits website several stereo datasets from highand ver y high resolution spacebor ne stereosensors on three areas in Catalonia, Spain,covering urban, r ural, and forest areas in flatand medium undulated ter rain as well as steepmountainous ter rain. In addition to these data-sets, a lidar DSM generated by the InstitutCartogràfic de Catalunya (ICC) is providedas a reference for comparison (ReinaRtz etal. 2010). Among the three available regions(Ter rassa, La Mora, Vacarisses), the datasetlocated over Ter rassa was chosen for our ex-periments.

The dataset consists of:● a subset of a stereopair acquired by WV1

on August 28, 2008, composed by a na-dir (-1.3º) and off-nadir (33.9º) scene, withground sample distances of 0.50 m and0.76 m, respectively; both scenes are pan-chromatic and have size 10,000 x 10,000pixels; rational polynomial coefficients(RPC) are available; for our experimentonly the nadir scene is used;

● a stereopair acquired by C1 in Febr uar y2008, composed of a backward (-5º) andfor ward (33.9º) looking scene, with groundsample distance of 2.5 m; both scenes arepanchromatic; RPCs are available;

● ground coordinates of points;

An example of constrained connectivitypartitioning is shown in Fig. 2. In this figure,the original image is partitioned with increas-ing values of alpha and omega parameters. Tohighlight the hierarchical property of the re-sulting partitions, random colours were givento the connected components of the finest par-tition while the colour of a given connectedcomponent of a subsequent level is inheritedfrom the largest connected component of theprevious level, contained by this connectedcomponent.

2.4 DSM Segmentation

The approach used for DSM segmentationand identification of concave, convex and flatregions was proposed by pesaResi & bene-Diktsson (2001) and applied to satellite images.The method is a mor phological segmentationby the derivative of the mor phological profile,based on the use of residuals from opening andclosing (levelling) by reconstr uction.

A str ucture or an “object” in an image is de-fined as a connected component of pixels shar-ing the same mor phological characteristics,i.e. “flat”, “concave”, “convex” in case of thelocal curvature of the grey level function sur-face. The residuals between any opening byreconstr uction (or closing by reconstr uction)and their original function are inter preted as ameasure of the relative brightness of the str uc-ture (or relative darkness). Then, one member-ship function can be defined cor responding tothe class “convex” and another membershipfunction cor responding to the class “concave”.The levelling algorithm is rewritten as a de-cision r ule based on the greatest value of themembership function. The segmented imageof the characteristic is a tessellation of threedifferent labels like “convex”, “concave” and“flat”. For this segmented image, pixels wherethe lower levelling is strictly lower than theoriginal image are labelled “convex”, and pix-els where the upper levelling is strictly greaterthan the grey function are labelled as “con-cave”. Pixels labelled “flat” have maintainedthe same value of the grey function in both theupper and lower levelling. Therefore, thosepixels have been indifferent to the erosion/dilation-reconstr uction process with a given


PP (SATellite image Precision Processing) by4DiXplorer AG was used (zHang 2005).

The C1 stereopair was oriented by esti-mating the cor rection of the available RPCthrough an affine transformation using nineground control points (GCPs). Then a DSMwas generated using the advanced approachin SAT-PP based on a coarse-to-fine hierar-chical solution with an effective combina-tion of several image matching methods andautomatic quality indication (zHang & gRün

2004). To improve the conditions for featureextraction and matching, the images were pre-processed using Wallis filter. Few seed pointswere measured in the epipolar images in ste-reo mode to provide the initial approximationof the surface. The DSM was generated with aregular grid space of 2.5 m (DSM_C1, Fig. 3).No further editing was applied on the surfacemodel. In the DSM it is possible to distinguishurban areas, but buildings are not well definedfrom adjacent buildings or roads. In open are-as, the surface model follows the ter rain shapewithout outliers.

The nadir WV1 scene was georeferenced,with the estimation of affine cor rection pa-rameters of the given RPCs by means ofGCPs. This was followed by the orthorectifi-cation, using the DSM_C1 as elevation model.As a result an orthoimage at 2.5 m GSD wasproduced.

● a dense 3D point cloud acquired on Novem-ber 27, 2007 with airbor ne lidar.The area is characterized by urban, r ural

and forest cover on flat and hilly ter rain (Rei-naRtz et al. 2010). The ter rain elevation rangesfrom 200 m to 430 m.

As explained in the introduction, in this pa-per we propose an approach to refine a coarseresolution DSM using a single VHR scene, asgenerally VHR stereopairs are not available todescribe pre-event scenarios. Therefore, evenif a VHR stereopair is available, from the Ter-rassa dataset we used only the C1 stereopairto generate a DSM and the nadir WV1 sceneto refine it. The processing workflow includesthe following steps:● orientation of C1 stereo scenes,● DSM generation at 2.5 m grid spacing

(DSM_C1),● orientation of WV1 nadir scene,● orthorectification of WV1 scene using

DSM_C1 at ground sample distance of2.5 m,

● segmentation of the WV1 orthoimage,● enhancement of DSM_C1 with generation

of a new DSM at 2.5 m grid spacing.

3.2 Orthorectification and DSMGeneration

For the photogrammetric processing of C1 andWV1 scenes the commercial software SAT-

Fig. 3: Colour-shaded visualization of DSM generated from Cartosat-1 stereo scenes over Ter-rassa area. Overview of the full model (left) and zoom on urban and open areas (right).


ondar y roads on different levels, a river withbridges, open areas with vegetation. Follow-ing the methodology previously described,four DSMs were calculated from the originalDSM, using the minimum value (ENH_MIN),the maximum value (ENH_MAX), the meanvalue (ENH_AVG) of the heights in each seg-ment, and planar fitting (ENH_PLA). From avisual comparison of the original DSM (Fig. 4(c)) and the enhanced DSMs with average andplanar fitting (Fig. 5 (a) and (b) respectively)on the subset mentioned above, it is evidentthat in the enhanced DSMs the building edg-

3.3 Segmentation and DSMEnhancement

The WV1 orthoimage was pre-processed withedge-shar pening and segmented. For the seg-mentation, a value of 50 intensity levels forboth the alpha and omega thresholds wasfound to be optimal, in the sense that it ena-bled the formation of connected componentsmatching the subparts of the building roofswhile avoiding their merging with adjacentobjects. An example is given in Fig. 4 (a) and(b) on a subset with buildings, main and sec-

(a) or thophoto (b) segmented image

(c) DSM_C1 (d) Lidar DSM

Fig. 4: Results of DSM enhancement over a subset. (a) WV1 orthoscene at 2.5 m grid spacing, (b)segmentation, using alpha = omega = 50 over a subset, (c) original Cartosat-1 DSM, (d) referencelidar DSM used for quality analysis.


erence one and to DSM_C1 to evaluate if theoriginal er rors of the DSM_C1 were increas-ing or decreasing with the respect to the lidarafter the refinement. As expected, we noticein Fig. 6 that by applying the minimum values,the er rors in cor respondence of ground leveldecreases, while using the maximum value theenhanced DSM improves in cor respondenceof buildings. Therefore the criteria should becombined. The original DSM was segmentedin concave, convex and flat regions and thenew DSM was calculated with minimum andmaximum height values in segments falling inconcave and convex regions respectively, andmedium value elsewhere (ENH_HYB).

es are more delineated and objects at differentheights, like rivers and bridges, are better de-fined than in DSM_C1. In homogeneous areas(roof faces, roads, parking areas, fields, andso on) small details due to noise are removed.For the pur poses of our research, we focusedthe analysis on buildings. With respect to thefunction used to enhance the DSM, there is nota significant difference between average andplanar fitting at this resolution, as in DSM_C1itself it is not possible to recognize shape de-tails, like roof faces, with sufficient accuracy.

The enhanced DSMs obtained using theminimum and maximum values (ENH_MINand ENH_MAX) were compared to the ref-

Fig. 5: Results of DSM enhancement over a subset using (a) average values, (b) planar fitting, (c)a combination of minimum, maximum and median values.


in the enhanced DSM is similar to the one inthe original DSM, but the profiles of the build-ings are improved (profiles 1, 2 and 5).

The presented method shows a limitationin case of complex buildings with homoge-neous texture. With reference to profile 2, thebuilding portions at different heights cannotbe distinguished in the panchromatic channel(white circle in Fig. 4 (a)), resulting in a sin-gle segment after partitioning (white circle inFig. 4 (b)), and a single height value for thatsegment in the enhanced DSM. A refining ofthe segmentation with additional information,like multispectral channels, could improve theidentification of sub-str uctures of the roofs atdifferent heights.

The statistics on the height differences be-tween the original and enhanced DSMs withrespect to the lidar one are summarized inTab. 1. The mean er ror, the mean absolute

3.4 Quality Analysis

For the quality analysis of the enhanced DSM,the lidar DSM was used as reference (Fig. 4(d)). We evaluated the height differencesthrough the surface profiles in dense urbanand industrial areas. Some examples are re-ported in Fig. 7.

With respect to the original DSM, the en-hanced one produces step profiles in cor re-spondence of buildings exter nal edges, as inprofile 2, and in some cases the algorithm de-tected buildings that were missing in the origi-nal DSM (profiles 1 and 5). The enhancementis not sensitive to the ter rain slope (profile 4).Obviously the accuracy of the enhanced DSMdepends on the accuracy of the initial DSM.Indeed small cor ridors between buildingsare well defined in the lidar DSMs, but not inDSM_C1; as a consequence the ground height

Fig. 6: Zoom on height gradients of ENH_MIN (left) and ENH_MAX (right): yellow = error de-creases, blue = error increases, white = error does not change.

Tab. 1: Statistics of height differences between the original and refined DSM with respect to lidaralong five transects (μ = mean value, MAD= mean absolute deviation, RMSE = root-mean-squareerror) (m).

Profile 1 Profile 2 Profile 3 Profile 4 Profile 5μ MAD RMSE μ MAD RMSE μ MAD RMSE μ MAD RMSE μ MAD RMSE

Original -1.64 2.91 3.78 -2.00 3.37 4.15 0.70 2.41 3.25 -1.22 2.59 3.21 - 4.21 4.86 5.88Refined -1.03 2.90 3.89 -1.18 2.08 2.77 0.40 1.90 1.87 -0.51 2.50 3.35 -2.87 4.06 4.92# points 37 70 140 209 74


Fig. 7: Profiles of original C1 stereo DSM (green), lidar DSM (red) and ENH (blue) along the fivetransects shown in the upper left image.


Fig. 8: Perspective 3D visualization of original DSM (DSM_C1, left) and enhanced DSM (ENH_HYB, right), both with vertical exaggeration 2.0, texture with C1 orthoimage (2.5 m).

Fig. 9: Perspective 3D visualization in wireframe mode of original DSM (DSM_C1, left) and en-hanced DSM (ENH_HYB, right), both with vertical exaggeration 2.0.

Fig. 10: Histogram of the local slope of lidar, original C1 stereo and enhanced DSM (logarithmicscale).


experiments of damage assessment using au-tomatic building detection in the original andenhanced DSMs.

Acknowledgements

Special thanks are given to the data providersfor the provision of the stereo datasets, name-ly: Euromap for the Cartosat-1 data, DigitalGlobe for the WorldView-1 data and ICC Cat-alunya for the reference data.

References

aReFi, H., 2009: From LIDAR Point Clouds to 3DBuilding Models. – PhD thesis, BundeswehrUniversity Munich.

baltsaVias, e.p., Mason, s. & stallMann, D.,1995: Use of DTM/DSMs and Or thoimages toSuppor t Building Extraction. – gRUen, a., kUe-bleR, o. & agoURis, p. (eds.): Automatic Extrac-tion of Man-Made Objects from Aerial andSpace Images. – Birk häuser Verlag, Basel, 199 –210 .

JaYnes, C.o., Collins, R.t., stolle, F.R., Hanson,a.R., RiseMan, e.M. & sCHUltz, H.J., 1996:Three-Dimensional Grouping for Site Modelingfrom Aerial Images. – Ar pa Image Understand-ing Workshop: 223–236.

HoRoWitz, s. & paVliDis, t., 1976: Picture segmen-tation by a tree traversal algorithm. – Jour nal ofthe ACM 23 (2): 368–388.

klaUs, a., soRMann, M. & kaRneR, k., 2006: Seg-ment-based stereo matching using belief propa-gation and a self-adapting dissimilarity meas-ure. – 18th Inter national Conference on Patter nRecognition 2006: 15–18.

kRaUss, t. & ReinaRtz, p., 2010: Enhancement ofdense urban digital surface models from VHRoptical satellite stereo data by pre-segmentationand object detection. – The Inter national Ar-chives of Photogrammetr y and Remote Sensingand Spatial Infor mation Sciences XXXVIII (1) .

li, Y., zHU, l. & sHiMaMURa, H., 2008: Integratedmethod of building extraction from digital sur-face model and imager y. – The Inter national Ar-chives of the Photogrammetr y, Remote Sensingand Spatial Infor mation Sciences XXXVII(B3b).

lU, Y., tRinDeR, J. & kUbik, k., 2002: Automaticbuilding extraction for 3D ter rain reconstr uctionusing inter pretation techniques. – Techniques,School of Sur veying and Spatial Infor mation

deviations and RMSE are decreasing for therefined DSMs, except for non-significant in-crease of the RMSE of profile 1.

The 3D visualizations of the original andenhanced DSMs in Figs. 8 and 9 show that af-ter the refinement, small blobs at the groundlevel are removed and the building edges areshar per and more realistic than in the origi-nal DSM. This improvement is confirmedby the analysis of the distribution of the lo-cal slope, calculated as the difference betweenthe maximum and minimum height within a 3by 3 neighbourhood. This analysis is car riedout over the lidar, C1 stereo, and enhancedDSMs, see Fig. 10. Indeed, the distribution ofthe slope of the original DSM gets substantial-ly closer to that of the lidar after enhancement,apart from a few outliers at the tail of the dis-tribution.

4 Conclusions

In this paper we presented an approach for theenhancement of digital surface models usingver y high resolution satellite images. Afterorthorectification, the images are segmentedthrough an advanced hierarchical image parti-tioning, then the segments are projected on theexisting DSM. The new height values are cal-culated for each segment with planar fitting orconstant (average) value of the original DSMheights contained in that segment. The meth-od was applied on a dataset over Catalonia,Spain, provided by ISPRS WG I/4 throughthe project “Benchmarking and quality anal-ysis of DEM generated from high and ver yhigh resolution optical stereo satellite data”.During this experiment a DSM was gener-ated by a Cartosat-1 stereopair and a World-View-1 (WV1) quasi-nadir scene was used forthe DSM enhancement. The original and finalDSMs were compared to the lidar DSM onthe same area for quality analysis. The anal-ysis of the profiles (with statistics), and localslopes and the 3D visualization shows thatin the enhanced DSM buildings contours areshar per and more realistic than in the originalone, which is favourable for automatic build-ing detection. Further investigations will alsoinclude the use of multispectral information toimprove the image segmentation and practical


soille, p., 2011: Preventing chaining through tran-sitions while favouring it within homogeneousregions. – ISMM 2011, Lecture Notes in Com-puter Science 6671: 96 –107.

tao, g. & YasUoka, Y., 2002: Combining high re-solution satellite imager y and airbor ne laserscanning data for generating bare land DEM inurban areas. – The Inter national Archives ofPhotogrammetr y and Remote Sensing and Spa-tial Infor mation Sciences XXXIV (5/W3), Kun-ming, China.

zHang, l., 2005: Automatic Digital Surface Model(DSM) Generation from Linear Ar ray Images.– PhD Disser tation, No. 88, Institute of Geodesyand Photogrammetr y, ETH Zurich, Switzerland.

zHang, l. & gRün, a., 2004: Automatic DSMgeneration from linear ar ray imager y data. – TheInter national Archives of the Photogrammetr y,Remote Sensing and Spatial Infor mation Sci-ences XXXV (3): 128–133.

Address of the Authors:

Dr.-Ing. Daniela poli, Dr.-Ing. habil. pieRRe soille,European Commission – Joint Research Centre, In-stitute for the Protection and Security of the Citi-zen, Global Security and Crisis Management Unit,Geo-Spatial Infor mation Analysis for Global Secu-rity and Stability (ISFER EA) Action, Via E. Fer mi2749, I-21027 Ispra (VA) Italy, TP 267, Tel. +39-0332-789241, -785068, Fax +39-0332-785154, e-mail: first name.last [email protected]

Manusk ript eingereicht: September 2011Angenommen: November 2011

Systems, University of New South Wales, NSW2052 Australia.

lU, Y., tRinDeR, J. & kUbik, k., 2006: AutomaticBuilding Detection Using the Dempster-ShaferAlgorithm. – Photogrammetric Engineering &Remote Sensing 72 (4): 395– 403.

MaiRe, C., 2010: Image Infor mation Extraction andModeling for the Enhancement of Digital Eleva-tion Models. – PhD Thesis, Karlsr uhe Instituteof Technology (KIT).

papaRoDitis, n., CoRD, M., JoRDan, M. & CoCqUeR-ez, J.p., 1998: Building Detection and Recon-str uction from Mid- and High-Resolution AerialImager y. – Computer Vision and Image Under-standing 72 (2): 122–142.

pesaResi, M. & beneDiktsson, J.a., 2001: A newapproach for the mor phological segmentation ofhigh-resolution satellite imager y. – IEEE Trans-actions on Geoscience and Remote Sensing 39(2): 309 –320.

poli, D. & soille, p., 2011: Refinement of digitalsurface models through constrained connectivi-ty par titioning of optical imager y. – The Inter na-tional Archives of Photogrammetr y and RemoteSensing and Spatial Infor mation Sciences XXXVIII (4/W19).

ReinaRtz, p., D’angelo, p., kRaUss, t., poli, D., Ja-Cobsen, k. & bUYUksaliH, g., 2010: Benchmark-ing and quality analysis of DEM generated fromhigh and ver y high resolution optical stereo sat-ellite data. – The Inter national Archives of Pho-togrammetr y and Remote Sensing and SpatialInfor mation Sciences XXXVIII (1) .

soille, p., 2008: Constrained connectivity for hier-archical image par titioning and simplification. –IEEE Transactions on Patter n Analysis and Ma-chine Intelligence 30 (7): 1132–1145.

digital surface model extraction and refinement through

Documents