
This is the author's version of a work that was submitted/accepted for publication in the following source:

Guo, Xufeng, Dean, David B., Denman, Simon, Fookes, Clinton B., & Sridharan, Sridha (2011) Evaluating automatic road detection across a large aerial imagery collection. In Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications, IEEE, Sheraton Noosa Resort & Spa, Noosa, QLD, pp. 140-145.

This file was downloaded from: http://eprints.qut.edu.au/47715/

© Copyright 2011 IEEE

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Notice: Changes introduced as a result of publishing processes such as copy-editing and formatting may not be reflected in this document. For a definitive version of this work, please refer to the published source:


Evaluating Automatic Road Detection Across a Large Aerial Imagery Collection

Xufeng Guo, David Dean, Simon Denman, Clinton Fookes, Sridha Sridharan
Image and Video Research Laboratory, Queensland University of Technology, Brisbane, Australia

Email: [email protected], [email protected], {s.denman, c.fookes, s.sridharan}@qut.edu.au

Abstract—The automated extraction of roads from aerial imagery can be of value for tasks including mapping, surveillance and change detection. Unfortunately, there are no public databases or standard evaluation protocols for evaluating these techniques. Many techniques are further hindered by a reliance on manual initialisation, making large scale application of the techniques impractical. In this paper, we present a public database and evaluation protocol for the evaluation of road extraction algorithms, and propose an improved automatic seed finding technique to initialise road extraction, based on a combination of geometric and colour features.

I. INTRODUCTION

As the amount of aerial data being captured continues to increase, it is important to develop techniques to automatically process this data. Roads and road networks are key features in aerial imagery of built-up areas, and the automatic extraction of these features can be valuable for tasks such as mapping, change detection, and surveillance. At present, there is no standard evaluation methodology or database for evaluating road extraction techniques across a wide variety of regions, with most algorithms presented on a small set (typically less than a dozen) of images. This fragmentation of aerial imagery datasets makes it difficult to develop, benchmark and compare algorithms. Furthermore, many existing techniques require manual initialisation, making such large scale evaluations impractical.

This article attempts to address both these limitations by: 1) introducing a public evaluation database, consisting of 300 locations at 3 different resolutions, covering many different terrain types (suburban, rural, forest, river and ocean) and different weather and illumination conditions; and 2) proposing an improved road network seed detection algorithm for the automated extraction of road networks.

II. EVALUATING ROAD DETECTION ALGORITHMS

A. Aerial imagery collection

In order to allow our database to be easily distributed to other researchers, we chose to collect our aerial imagery from the Australian aerial imagery company NearMap under their Community Licence1, which allows for distribution under the Creative Commons Attribution Share Alike (CC-BY-SA) license. High resolution aerial imagery was downloaded from NearMap's servers for 300 randomly chosen locations within the greater South-East Brisbane region, bounded by (27.40°S, 152.95°E) and (27.97°S, 153.44°E), as shown in Fig. 1. By choosing these locations randomly across a wide area, the database provides a wide variety of urban, suburban, rural and bush areas, including a significant subset of tiles that do not have any roads at all.

1http://www.nearmap.com/products/community-licence-overview

Figure 1. Aerial imagery was collected in 300 randomly selected locations (shown as purple squares) across the greater south-east Brisbane region. [Image CC-BY-SA OpenStreetMap Contributors and QUT]
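The random-location sampling described above can be sketched as follows. This is an illustrative sketch only, assuming NearMap tiles follow the standard slippy-map (Google/OSM) tiling convention; the seed value and helper names are not from the paper.

```python
import math
import random

def deg_to_tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to slippy-map tile indices at a zoom level."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
    return x, y

# Bounding box of the greater South-East Brisbane region (from the paper);
# southern latitudes are negative.
x_min, y_min = deg_to_tile(-27.40, 152.95, 16)  # north-west corner
x_max, y_max = deg_to_tile(-27.97, 153.44, 16)  # south-east corner

random.seed(0)  # illustrative; any source of randomness works
locations = set()
while len(locations) < 300:
    locations.add((random.randint(x_min, x_max), random.randint(y_min, y_max)))
```

Sampling tile indices (rather than raw coordinates) guarantees each location aligns exactly with one zoom-16 tile.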

Each of the 300 regions chosen for the database is approximately 541 m × 541 m, the area corresponding to a single zoom-16 tile available from NearMap's aerial imagery servers. By choosing a single zoom-16 tile for each location, higher resolution imagery can easily be obtained for the same location by collecting the zoom-17 or zoom-18 tiles that also construct that location, as illustrated in Fig. 2. Having three distinct zoom levels available allows road-detection algorithms to easily be evaluated at multiple scales in the same location.

Figure 2. NearMap aerial imagery is available in 256 × 256 pixel tiles. Each location was collected as a single zoom-16 tile, 4 zoom-17 tiles, and 16 zoom-18 tiles.
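The parent–child tile relationship described above (one zoom-16 tile covered by 4 zoom-17 and 16 zoom-18 tiles) follows from the quad-tree structure of standard tile pyramids; a minimal sketch, assuming that convention:

```python
def child_tiles(x, y, zoom, target_zoom):
    """Tiles at target_zoom that together cover tile (x, y) at a coarser zoom."""
    factor = 2 ** (target_zoom - zoom)
    return [(x * factor + dx, y * factor + dy)
            for dy in range(factor) for dx in range(factor)]
```

For example, `child_tiles(x, y, 16, 18)` yields the 16 zoom-18 tiles for a location collected as zoom-16 tile (x, y).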

NearMap imagery is also available at different dates for many regions in the coverage area, and while we plan to include multiple dates for each location in future datasets, at this stage we have simply chosen the latest date available for each location. However, as the entire database region was not flown in a single flight, the database as it exists at present does have some variation due to differences in time-of-day and weather during the separate flights.

B. Reference road network

While limited road detection evaluations can be performed by inspection, large scale evaluation of road detection algorithms requires a reference road network against which the extracted road network can be compared. For our reference road network we chose to use CC-BY-SA licensed street data provided by the OpenStreetMap project2. A local extract of the database region was downloaded at the time of constructing the database (May 2011) and kept locally to ensure that the reference data would not change. The CC-BY-SA license of the OpenStreetMap project allows the reference road network to be easily shared with other researchers alongside the similarly licensed NearMap imagery.

Because the OpenStreetMap database contains features that are not presently of interest to our road detection algorithms (such as footpaths, parks, buildings, etc.), we filtered the local extract to only contain objects that were likely to be paved roads suitable for automotive usage. In particular, objects tagged as a highway with the values motorway(_link), trunk(_link), primary(_link), secondary(_link), tertiary, residential, unclassified, living_street, service or pedestrian were retained in a local road-only reference dataset. This road-only dataset was then converted to a 1-pixel wide skeleton image matched with each location and scale in the evaluation database for later use in performance metric calculation.

2http://www.openstreetmap.org
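The tag-based filtering step can be sketched as below. The highway-value whitelist comes from the paper; the dict-based way representation is an illustrative simplification of OSM data parsing, not part of the original pipeline.

```python
# Highway tag values retained in the road-only reference dataset (from the paper).
ROAD_HIGHWAY_VALUES = {
    "motorway", "motorway_link", "trunk", "trunk_link",
    "primary", "primary_link", "secondary", "secondary_link",
    "tertiary", "residential", "unclassified", "living_street",
    "service", "pedestrian",
}

def is_road(way_tags):
    """Keep only OSM ways whose highway tag marks a likely drivable road."""
    return way_tags.get("highway") in ROAD_HIGHWAY_VALUES

# Hypothetical ways, each represented by its tag dictionary.
ways = [
    {"highway": "residential"},
    {"highway": "footway"},      # dropped: not drivable
    {"leisure": "park"},         # dropped: not a highway at all
    {"highway": "motorway_link"},
]
roads = [w for w in ways if is_road(w)]
```

In practice the same predicate would be applied to every way in the local OSM extract before rasterising the surviving geometry to the skeleton images.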

C. Performance metrics calculation

In order to allow for a comprehensive evaluation of road detection algorithms on our aerial imagery database, we have chosen to use the completeness, correctness and quality measures first proposed by Harvey [1], defined as follows.

The completeness (Cp) of a road network is the percentage of the reference road network that is successfully extracted by the road detection algorithm, and can be defined as

    Cp = Lmr / Lr,    (1)

where Lmr is the length of the matched reference, and Lr is the length of the reference (for a given image). If there is no reference road network, the completeness is assumed to be 100%.

The correctness (Cr) of a road network is the percentage of the extracted road network that is matched by the reference network, and can be defined as

    Cr = Lme / Le,    (2)

where Lme is the length of the matched extraction, and Le is the length of the extracted road network (for a given image). If there is no extracted road network, the correctness is assumed to be 100%.

Finally, the quality (Q) of a road network measures the contribution of the matched roads to the entire extracted and reference road network (where 100% implies that the entire network is matched), and can be defined as

    Q = Lme / (Le + Lr − Lmr).    (3)

Similarly to the correctness and completeness, if there is no reference and extracted road network in a given image, the quality is assumed to be 100%.
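The three metrics and their empty-network conventions can be computed directly from the four lengths; a minimal sketch (function and argument names are illustrative):

```python
def road_metrics(l_mr, l_r, l_me, l_e):
    """Completeness (Cp), correctness (Cr) and quality (Q) from the matched
    and total skeleton lengths, applying the 100% conventions for images
    with no reference and/or no extracted network."""
    cp = 1.0 if l_r == 0 else l_mr / l_r        # Eq. (1)
    cr = 1.0 if l_e == 0 else l_me / l_e        # Eq. (2)
    denom = l_e + l_r - l_mr
    q = 1.0 if denom == 0 else l_me / denom     # Eq. (3)
    return cp, cr, q

cp, cr, q = road_metrics(l_mr=80, l_r=100, l_me=90, l_e=120)
```

For the example above: Cp = 80/100, Cr = 90/120, and Q = 90/(120 + 100 − 80).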

While the calculation of the total length of an extracted (Le) and/or reference (Lr) road network can easily be performed by counting the total number of pixels in a skeleton network image, the calculation of the matched lengths Lme and Lmr requires matching road segments to be identified first. For our evaluations, the 'buffer method' [2], illustrated in Fig. 3, was used to match the two road networks. By using a dilation buffer around the reference or extracted road network, and intersecting it with its counterpart, the matched and unmatched portions of the network can easily be calculated at a pixel level for use in metrics calculation. Dilation was performed using a line structuring element, arranged perpendicular to the road being dilated, with a length of 5, 10 or 20 pixels at zoom levels 16, 17 and 18 respectively.
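The buffer method can be sketched as below. For brevity this sketch dilates with a square structuring element; the paper uses a line element perpendicular to the road, so treat this as a simplified illustration rather than the exact evaluation code.

```python
import numpy as np

def shift(mask, dy, dx):
    """Shift a binary mask by (dy, dx), padding with False at the edges."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        mask[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def matched_length(network, counterpart, buffer_px):
    """Buffer method: dilate the counterpart network, then count the pixels
    of `network` falling inside the buffer (the matched length)."""
    buffered = counterpart.copy()
    for dy in range(-buffer_px, buffer_px + 1):
        for dx in range(-buffer_px, buffer_px + 1):
            buffered |= shift(counterpart, dy, dx)
    return int((network & buffered).sum())

# Toy example: extraction one row below the reference is matched with a
# 1-pixel buffer but not with a 0-pixel buffer.
ref = np.zeros((10, 10), dtype=bool); ref[5, :] = True
ext = np.zeros((10, 10), dtype=bool); ext[6, :] = True
```

Computing `matched_length(ext, ref, b)` gives Lme, and swapping the arguments gives Lmr.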

III. ROAD DETECTION ALGORITHM

A. Existing approaches

Figure 3. Calculation of the length of the matched extraction network (Lme, shown in (a)) and the length of the matched reference network (Lmr, shown in (b)) is performed by dilating the counterpart network and taking the intersection.

Gruen and Li [3] suggest that a typical road extraction procedure can be divided into three stages: image pre-processing, road finding and road following. Pre-processing typically consists of steps such as colour conversion [5], normalisation [4], or sharpening [3].

Road finding is performed through seed detection. A seed is a point in the image that has a high likelihood of being a road, and these points are the starting points for growing the road network. The majority of existing systems require manual seeding [6], [3], [1], [7], which is time consuming and prevents the system from being automatic. Automatic seed detection techniques are proposed in [5], [8], [9]. Christophe et al. [5] and Laptev et al. [8] use line detection to locate the road edge and road body respectively. Hu et al. [9] assess the rectangularity of regions surrounding potential seed points using a 'spoke wheel' operator. These techniques [5], [8], [9], however, are all prone to error when encountering obstacles on the road such as overhanging trees, shadows or vehicles.

Road following typically seeks to extract one or more features to detect continuously as the road network is grown. Baumgartner and Hinz [6] rely on colour, using a colour profile obtained during a manual initialisation to predict and grow the road network. Laptev et al. [8] use a ribbon snake to combine intensity and texture to follow the road, while Christophe et al. [5] rely on gradient information. Hu et al. [9] grow the road network through repeated application of the spoke wheel operator (see Fig. 4) to obtain a road footprint. The spoke wheel is iteratively applied at the peaks of the previously extracted footprints, until the final extracted footprint yields no more children.

B. Proposed road detection algorithm

In this paper we propose a modified version of the algorithm presented by Hu et al. [9]. The proposed approach first performs automatic seed detection, followed by road extraction. The extracted road network can be converted to a vector format for comparison with the ground truth.

Figure 4. An example road footprint is found by (a) laying a spoke operator over the road network, (b) taking the intersection of the spokes with the road edges, and (c) finding the maximum-distance points for continuing the road network.

Like [9], the proposed approach relies on the detection of footprints to extract the road network. The footprint of a pixel describes the geometric characteristics of the local area, such as its rectangularity and orientation, and is determined using a spoke wheel operator [9]. As shown in Fig. 4, a road footprint is extracted by creating a spoke wheel, W, with N spokes of radius M, centred at point P, denoted as W(P, N, M). For the system presented here, N = 64 and an on-ground spoke length of 10 metres were chosen.

The intersection of the spoke wheel with the edge of the road network, Ci, is determined for each spoke i as the first point, moving from the centre outwards, that meets the requirement

    |I(Ci) − I(P)| ≥ k × σ(W(P, N, M)),    (4)

where I(x) is the intensity of pixel x, σ(W(P, N, M)) is the standard deviation of the intensity of all pixels on wheel W, and k is used to tune the threshold. A larger value of k allows more flexibility in ignoring obstacles in the road network, but introduces the risk of including off-road areas. For the system presented in this paper, k was determined empirically to be 0.5 on a small subset of evaluation images.

If no intersection is found for a particular spoke, then Ci is set to the end of the spoke, and the final footprint is formed by joining all individual Ci points, as shown in Fig. 4(b).
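The spoke-wheel footprint extraction with the stopping rule of Eq. (4) can be sketched as follows. This is a simplified reading of the operator (intensities are compared directly to the centre pixel, and σ is estimated over all sampled wheel pixels); parameter defaults other than N = 64 and k = 0.5 are illustrative.

```python
import math
import numpy as np

def footprint(img, p, n_spokes=64, radius=20, k=0.5):
    """Trace each spoke outwards from centre p and stop at the first pixel
    whose intensity differs from the centre by at least k standard
    deviations of the wheel intensities (Eq. 4); a spoke with no such
    pixel keeps its full length."""
    py, px = p
    spoke_pts, wheel = [], []
    for i in range(n_spokes):
        theta = 2 * math.pi * i / n_spokes
        pts = []
        for r in range(1, radius + 1):
            y = int(round(py + r * math.sin(theta)))
            x = int(round(px + r * math.cos(theta)))
            if 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
                pts.append((y, x))
        spoke_pts.append(pts)
        wheel.extend(pts)
    sigma = float(np.std([img[y, x] for (y, x) in wheel]))
    centre = float(img[py, px])
    boundary = []
    for pts in spoke_pts:
        hit = pts[-1]  # default: end of the spoke
        for (y, x) in pts:
            if abs(float(img[y, x]) - centre) >= k * sigma:
                hit = (y, x)
                break
        boundary.append(hit)
    return boundary

# Toy image: a bright vertical 'road' 10 px wide on a dark background.
img = np.full((60, 60), 50.0)
img[:, 20:30] = 200.0
pts = footprint(img, (30, 25), n_spokes=8, radius=20)
```

On this toy image, spokes crossing the road edge stop at the edge, while spokes running along the road keep their full length, producing the elongated footprint shape described above.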

The proposed approach performs automatic seed detection to initialise the road extraction process. Ideally, the seed points should be guaranteed to be on a road, so that the road network can then be grown from these seed points. Conversely, road segments that do not contain a seed point may not be detected; therefore each road segment should also have at least one seed point.

We extend the seed detection process of [9], which was based purely on a footprint rectangularity measure, by incorporating additional saturation and network expansion constraints. Firstly, random points within the saturation layer of the HSI aerial image are chosen as possible seed locations. These points are filtered through three processes: a saturation threshold, a footprint rectangularity test, and a network expansion test. The saturation threshold removes seeds whose saturation intensity indicates they are likely to be off the road (see Fig. 5(b)). Similar to [9], the rectangularity test is conducted using the minimal oriented bounding box [10] to eliminate footprints with a non-rectangular shape. An example of the remaining seeds after the saturation and rectangularity tests is shown in Fig. 5(c).

Figure 5. An overview of the proposed road detection algorithm on an example location: (a) example aerial image; (b) saturation image; (c) seed points remaining after saturation and rectangularity tests; (d) seed points remaining after network expansion tests; (e) the union of all footprints in the detected network; (f) skeletonised road network. [Images CC-BY-SA NearMap and QUT]

The remaining seed points are tested for network expansion through a potential test and a mean stretch distance (MSD) test. The potential test measures how many generations a seed footprint can grow a road network; all seeds with a potential of less than 4 are discarded. Once a seed has shown that it can grow for at least 4 generations, the MSD is calculated as the mean distance of each end point from the original seed, and any seed with an on-ground MSD of less than 20 metres is rejected. An example of the remaining seeds after the network expansion tests is shown in Fig. 5(d).
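The network expansion filter can be sketched as below. The generation and MSD thresholds come from the paper; the ground resolution (`metres_per_pixel`) is an assumed value for illustration, since it depends on the zoom level.

```python
import math

def mean_stretch_distance(seed, end_points):
    """Mean Euclidean distance (in pixels) of each grown end point
    from the original seed."""
    sy, sx = seed
    return sum(math.hypot(y - sy, x - sx) for (y, x) in end_points) / len(end_points)

def passes_expansion_tests(generations, seed, end_points,
                           min_generations=4, min_msd_metres=20.0,
                           metres_per_pixel=0.5):
    """Network expansion filter: require at least min_generations of
    footprint growth, then an on-ground MSD of at least 20 metres.
    metres_per_pixel is an assumed resolution, not from the paper."""
    if generations < min_generations:
        return False  # potential test failed
    msd_m = mean_stretch_distance(seed, end_points) * metres_per_pixel
    return msd_m >= min_msd_metres
```

A seed that grows for 5 generations and stretches ~50 m on the ground passes; one that grows only 3 generations, or stays within ~5 m of its origin, is rejected.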

The final set of seed points is then used to grow the road network, using the footprint to propagate the network. Following [9], the peaks of the footprint indicate the directions of the road, hence the algorithm finds the footprint peaks and uses these points as the centres of the next footprints. This process continues until there are no peaks in the footprint, or the next footprint overlaps an existing footprint. For example, Fig. 4(c) shows the points v1, v2 and v3 as the peak vertices of the footprint.

Table I
AVERAGE SYSTEM PERFORMANCE OVER THE PROPOSED DATABASE FOR THREE DIFFERENT ZOOM LEVELS.

Zoom Level    Cp        Cr        Q
16            50.91%    47.90%    32.77%
17            63.03%    46.58%    36.58%
18            68.03%    54.50%    43.39%
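The iterative footprint-growing process can be sketched as a worklist loop. The `get_peaks` and `overlaps` callables are hypothetical interfaces standing in for the spoke-wheel peak detection and footprint overlap test; they are not part of the original implementation.

```python
def grow_network(seeds, get_peaks, overlaps):
    """Grow the road network by repeatedly applying the footprint operator:
    start at each seed, then continue at the peak vertices of each footprint,
    stopping a branch when a footprint has no peaks or overlaps the network.

    get_peaks(p)        -> list of peak points of the footprint centred at p
    overlaps(p, network) -> True if the footprint at p overlaps the network
    """
    network = []
    frontier = list(seeds)
    while frontier:
        p = frontier.pop()
        if overlaps(p, network):
            continue  # this branch has rejoined existing footprints
        network.append(p)
        frontier.extend(get_peaks(p))
    return network

# Toy 1-D example: each footprint has one peak one step further along,
# and the road ends after position 5.
net = grow_network([0],
                   lambda p: [p + 1] if p < 5 else [],
                   lambda p, network: p in network)
```

The overlap check is what terminates loops in the network, so the process halts even when two branches grow towards each other.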

Once the road network has finished growing, a minimal pruning process is conducted by removing all footprints that have a mean saturation value < 50. This approach helped to remove many false detections, such as those within buildings, but more sophisticated pruning techniques will be investigated in future research.

The remaining road footprints are then combined to form a binary image indicating the presence or absence of the road network, as shown in Fig. 5(e). Next, a sequence of opening and closing morphological operations was used to remove unwanted spikes and re-connect broken and disconnected segments, followed by a skeletonisation operation to obtain a rough 1-pixel wide road network, as shown in Fig. 5(f). Finally, this network was simplified using the Douglas-Peucker algorithm [11] and stored for later comparison with the reference network.
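The Douglas-Peucker simplification step [11] is a standard recursive algorithm; a self-contained sketch (the tolerance value is a caller-chosen parameter, not specified in the paper):

```python
import math

def perp_dist(pt, a, b):
    """Perpendicular distance of pt from the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = pt, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / norm

def douglas_peucker(points, epsilon):
    """Keep the point farthest from the end-to-end chord if it deviates by
    more than epsilon and recurse on both halves; otherwise collapse the
    whole run to a single segment."""
    if len(points) < 3:
        return list(points)
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right  # drop the duplicated split point
```

Applied to a skeleton polyline, this reduces long straight runs of pixels to a few vertices while preserving corners that deviate by more than the tolerance.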

IV. EVALUATION RESULTS AND DISCUSSION

By evaluating the proposed system against the evaluation protocol outlined in Section II, we obtained extracted street networks at three different zoom levels for each of the 300 evaluation locations.

The completeness (Cp), correctness (Cr) and quality (Q) for each location and zoom level were then calculated, and evaluation images for each location were created showing a colour-coded road network over the aerial imagery, indicating which sections correspond to matched extraction, false extraction and missed reference.

A summary of the quality metric across every location in the database at zoom 18 is shown in Fig. 6. Looking at the locations with Q > 20% (orange and green), it is clear that the proposed algorithm works best in the urban areas of the map, with few red locations (Q < 20%) in these areas. While there are some green locations (Q > 60%) outside of the urban areas, they typically do not contain any reference or extracted roads (and therefore receive 100% quality).

A summary of the road detection performance at each zoom level over the entire database is shown in Table I, and example evaluation images of two urban regions at each of the zoom levels are also shown in Fig. 7. These results show that, as would be expected, better performance can be obtained as the image resolution is increased.


Figure 6. An overview of the quality (Q) scores of the proposed system across all locations at zoom 18. Labels indicate example images used in the remainder of this paper in order of appearance. [Key: red: Q < 20%; orange: 20% ≤ Q ≤ 60%; green: Q > 60%. Image CC-BY-SA OpenStreetMap Contributors and QUT]

To further illustrate both the performance of the proposed road detection algorithm and the variety of locations available in the aerial imagery database, a number of example evaluation images are shown for high quality extraction and for low-to-medium quality extraction in Figs. 8 and 9 respectively.

It can clearly be seen in Fig. 9 that the proposed road detection algorithm falsely detects similar linear features as roads in (a) and (b), and has problems detecting roads under tree cover in (c) and (d). Fig. 9(e) is of particular interest as it shows roads that are actually on the ground but are not reflected in the ground truth; similar comments can be made about the car park lanes in (f), although the large amount of false detection on the buildings is still a considerable problem here.

V. CONCLUSION AND FUTURE WORK

Figure 7. Road detection performance of locations A and B at zoom levels 16, 17 and 18: (a) zoom 16 (Q = 51.5%); (b) zoom 16 (Q = 22.3%); (c) zoom 17 (Q = 69.6%); (d) zoom 17 (Q = 67.9%); (e) zoom 18 (Q = 84.9%); (f) zoom 18 (Q = 80.9%). [Key: matched extraction, false extraction, missed reference. All images CC-BY-SA NearMap and QUT]

In this article, we have presented a database and evaluation methodology for evaluating road extraction from aerial imagery, using publicly available aerial imagery and road network data. We have used this database to evaluate our proposed road extraction system, based on that of Hu et al. [9]. We introduce additional constraints in the automatic selection of seeds (colour, and additional geometric constraints) to improve the quality of the seeds, and thus the overall quality of the extracted road network. Future work will focus on using the proposed database to apply machine learning methods to the task of road extraction, to automatically learn model parameters and improve road detection performance.

Researchers interested in obtaining a copy of this aerial imagery collection to compare performance should get in contact with the final author using the email address provided.

ACKNOWLEDGEMENTS

The authors would like to thank both NearMap and the OpenStreetMap community for providing their data under the Creative Commons Attribution Share Alike (CC-BY-SA) license.


Figure 8. Examples of high quality (Q > 60%) road detection at zoom level 18: (a) location C (Q = 82.7%); (b) location D (Q = 79.5%); (c) location E (Q = 85.5%); (d) location F (Q = 75.2%); (e) location G (Q = 74.3%); (f) location H (Q = 84.8%). [Key: matched extraction, false extraction, missed reference. All images CC-BY-SA NearMap and QUT]

REFERENCES

[1] W. Harvey, "Performance evaluation for road extraction," Bull. Soc. Franç. Photogramm. Télédétection, no. 153, 1999.

[2] C. Wiedemann, C. Heipke, H. Mayer, and O. Jamet, “Empirical eval-uation of automatically extracted road axes,” in Empirical EvaluationTechniques in Computer Vision, 1998, pp. 172–187.

[3] A. Gruen and H. Li, "Road extraction from aerial and satellite images by dynamic programming," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 50, no. 4, pp. 11–20, 1995.

[4] M. Mokhtarzade and M. V. Zoej, "Road detection from high-resolution satellite images using artificial neural networks," International Journal of Applied Earth Observation and Geoinformation, vol. 9, no. 1, pp. 32–40, 2007.

[5] E. Christophe and J. Inglada, "Robust road extraction for high resolution satellite images," in Image Processing, 2007. ICIP 2007. IEEE International Conference on, vol. 5, 2007, pp. V-437–V-440.

[6] A. Baumgartner, S. Hinz, and C. Wiedemann, "Efficient methods and interfaces for road tracking," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 34, pp. 28–31, 2002.

Figure 9. Examples of medium and low quality (Q ≤ 60%) road detection at zoom level 18: (a) location I (Q = 0.0%); (b) location J (Q = 0.0%); (c) location K (Q = 29.5%); (d) location L (Q = 43.8%); (e) location M (Q = 34.7%); (f) location N (Q = 14.3%). [Key: matched extraction, false extraction, missed reference. All images CC-BY-SA NearMap and QUT]

[7] D. M. McKeown and J. Denlinger, "Cooperative methods for road tracking in aerial imagery," in Computer Vision and Pattern Recognition, 1988. Proceedings CVPR '88., Computer Society Conference on, Jun. 1988, pp. 662–672.

[8] I. Laptev, T. Lindeberg, W. Eckstein, C. Steger, and A. Baumgartner,“Automatic extraction of roads from aerial images based on scale-spaceand snakes,” 2000.

[9] J. Hu, A. Razdan, J. Femiani, M. Cui, and P. Wonka, "Road network extraction and intersection detection from aerial images by tracking road footprints," Geoscience and Remote Sensing, IEEE Transactions on, vol. 45, no. 12, pp. 4144–4157, Dec. 2007.

[10] P. L. Rosin, "Measuring rectangularity," Machine Vision and Applications, vol. 11, pp. 191–196, 1999, doi: 10.1007/s001380050101.

[11] D. H. Douglas and T. K. Peucker, “Algorithms for the reduction of thenumber of points required to represent a digitized line or its caricature,”Cartographica: The International Journal for Geographic Informationand Geovisualization, vol. 10, no. 2, pp. 112–122, Oct. 1973. [Online].Available: http://dx.doi.org/10.3138/FM57-6770-U75U-7727