


Automatic Road Extraction from Aerial Images

John C. Trinder and Yandong Wang
School of Geomatic Engineering, The University of New South Wales, Sydney, New South Wales 2052, Australia

John C. Trinder and Yandong Wang. Automatic Road Extraction from Aerial Images, Digital Signal Processing 8 (1998), 215–224.

The paper presents a knowledge-based method for automatic road extraction from aerial photography and high-resolution remotely sensed images. The method is based on Marr's theory of vision, which consists of low-level image processing for edge detection and linking, mid-level processing for the formation of road structure, and high-level processing for the recognition of roads. It uses a combined control strategy in which hypotheses are generated in a bottom-up mode and a top-down process is applied to predict the missing road segments. To describe road structures, a generalized antiparallel pair is proposed. The hypotheses of road segments are generated based on the knowledge of their geometric and radiometric properties, which are expressed as rules in Prolog. They are verified using part–whole relationships between roads in high-resolution images and roads in low-resolution images and spatial relationships between verified road segments. Some results are presented in this paper. © 1998 Academic Press

Key Words: computer vision; digital photogrammetry; feature extraction; image understanding; knowledge base; object recognition.

1. INTRODUCTION

Automatic road extraction from remotely sensed imagery has been an active research area in computer vision and digital photogrammetry for over two decades. During the past 20 years, a number of semiautomatic and automatic methods and algorithms for road extraction have been developed. Conventional methods of road extraction usually consist of three main steps: road finding, road tracking, and road linking. In road finding, local properties of the image are tested and road candidates are found using certain criteria. The detected road candidates are then traced to form road segments. The separated road segments are finally linked to generate a road network using geometric constraints. In semiautomatic road extraction, a road in the image is delineated using its geometric and photometric properties, with the initial positions provided by an operator [1–4]. These methods use local geometric constraints for road tracking and linking. Because the global structure of the road network is not considered, wrong segments are unavoidable, and disturbances such as trees, shadows, surface anomalies, and changes in road width can cause the tracking to be lost. In recent years, the active contour model (snakes) technique has received considerable attention [5–9]. In snakes, a linear feature in the image is modeled by an energy function expressed through geometric and photometric constraints, and the feature is extracted by optimizing the total energy. One important characteristic of snakes is that the geometric constraints are used directly to guide the search for the feature. Because road seeds are given along the roads to be extracted, the overall structure of the road network is well defined, which ensures that the extracted roads are reliable and accurate.

Automatic approaches pursue the automatic location of a road in the image by recognizing the road and defining its position accurately. Much effort in existing methods has gone into the automatic determination of the starting point or segment of a road using knowledge of local photometric properties of the road point or segment, such as the intensity value and contrast [10–13]. Local photometric properties cannot reliably classify a point or segment as belonging to a road, since they depend on many factors, such as the season, time, and weather conditions of photography. The same object in different images may have different gray values, while different objects in one image may have similar intensity. Recently, some knowledge-based methods using artificial intelligence techniques have been developed [14,15]. These methods use various types of knowledge about roads and the world, together with inference mechanisms, to extract the road network. In this paper, a knowledge-based method for automatic road extraction from aerial images and high-resolution remotely sensed imagery is proposed.

2. ROAD MODEL FOR RECOGNITION

To recognize an object automatically, it is necessary to define a semantic model of the object which can be implemented by computer. The model should include the definition of object classes, object-specific properties, and relationships between the object classes to be recognized [16]. For road extraction, the road model should cover the definition of road parts, the geometric and photometric properties of a road, and the relationships between road parts. The image scale must also be taken into account, because the appearance of a road in the image depends on the image scale. In small-scale images, a road appears as a line feature while a road intersection is a point feature. In medium-scale images, a road segment is a homogeneous area bounded by two parallel boundaries, within which the properties of the surface are measurable and lane lines may be visible. In the system described in this paper, a road network is represented at two different image resolutions, i.e., low and high resolution. In high-resolution images, a road segment is defined as an elongated area bounded by two parallel edges, with specific geometric properties, which can be obtained from road construction manuals, and spectral properties of the road surface. Road junctions are also homogeneous areas, but have different shapes, e.g., T and Y junctions. In low-resolution images, a road network is represented by lines and points which are connected to each other. Roads in high-resolution images can be considered parts of roads in low-resolution images; thus, they can be related by part–whole relationships. A road junction in a high-resolution image is treated as a specialization of a road intersection in a low-resolution image, and they are related by a specialization–generalization relationship.
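The two-resolution road model can be summarized in a small data structure. The sketch below is illustrative only; the class and attribute names are assumptions chosen for this description and are not taken from the authors' implementation.

from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class RoadSegment:
    """High-resolution part: an elongated area bounded by two parallel edges."""
    left_edge: List[Point]
    right_edge: List[Point]
    width: float              # road width, e.g. from road construction specifications
    mean_intensity: float     # spectral property of the road surface

@dataclass
class RoadLine:
    """Low-resolution element: a line of the road network."""
    points: List[Point]
    parts: List[RoadSegment] = field(default_factory=list)   # part-whole relationship

@dataclass
class Intersection:
    """Low-resolution element: a point where road lines meet."""
    position: Point
    lines: List[RoadLine] = field(default_factory=list)

@dataclass
class Junction(Intersection):
    """High-resolution specialization of an intersection (e.g. a T or Y junction)."""
    shape: str = "T"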

3. AUTOMATIC ROAD EXTRACTION

According to Marr's theory of vision [17], the overall framework for visual information processing consists of three major steps: (1) feature detection to generate the primal sketch (low-level processing), which is a collection of structural features of objects derived from a 2-D image; (2) grouping to produce the 2.5-D sketch (mid-level processing), a description of the object structure in object space; and (3) recognition to derive the 3-D model representation of the object's geometry (high-level processing).

It is believed that man-made objects usually consist of distinct points, lines, and regions. These features are related to each other and form the structure of the objects. In low-level processing, these features are extracted using different feature extraction algorithms. They can be extracted by a single operator, e.g., a polymorphic operator [18], or by several operators separately. At this stage, no knowledge of the objects is used.

The points, edges, and areas extracted in low-level processing are unstructured features of objects. For recognition, they need to be grouped to form the structure of the objects in mid-level processing. There are a number of techniques for grouping. One of the popular techniques used in computer vision is perceptual grouping, which is based on the principles of the Gestalt laws. In perceptual grouping, features are grouped based on some geometric assumptions. Recently, Fuchs and Forstner [19] presented polymorphic grouping based on the extracted points, lines, and areas and their topological relationships, and Henricsson [20] developed similarity grouping, which uses the chromatic attributes of features, for the extraction of buildings.

High-level processing uses the structure, associated attributes, and relationships produced in mid-level processing, together with models stored in a database, to generate the interpretation of objects. This process is called "indexing" in computer vision. In knowledge-based systems, objects are interpreted using the knowledge stored in the knowledge base. The essential part of the knowledge base is the information about the geometric and radiometric properties of the objects, because this is the basic information for recognition. A priori knowledge about the imaging system, knowledge of the object's topology, context, etc., may also be included in the knowledge base.

There are four different control strategies: bottom-up, top-down, hybrid, and heterarchical [21]. The bottom-up, or data-driven, strategy starts with low-level processing to extract object features; the object is recognized after its structure and relationships have been formed. The top-down, or goal-directed, strategy proceeds from hypothesis generation to finding instances by inference, while a hybrid system combines both data-driven and goal-directed control strategies [14,15,22–24]. In heterarchical control, different knowledge sources are viewed as cooperating and competing experts, and the expert who can help most in finding the solution of the subtask is selected. The method introduced in this paper uses a hybrid control strategy in which hypotheses are generated in a bottom-up mode and a top-down procedure is applied to verify the generated hypotheses.

3.1. Edge Detection and Linking

There exist a large number of edge detectors for aerial images and remotely sensed imagery. A good survey of existing methods and algorithms for boundary detection can be found in [25]. In our implementation, the Canny operator [26] and the SE operator [27] are applied to 2-D images to extract edge information. The extracted edge points are tracked by following the maximum gradient in the edge image. Edges with a gap of less than 5 pixels and a difference in direction of less than 30° are bridged [28]. After linking, a split-and-merge process is applied to all edge segments. An edge segment is split when the maximum distance of its edge points to the connection of its two end points is larger than a given value. This process is repeated until the maximum distance is under the threshold [29]. Each split edge segment is approximated by a second-order polynomial. Two neighboring edge segments are merged if their difference in spatial direction is small. Finally, short segments (<5 pixels) are treated as noise and removed.
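The split step can be illustrated by a short sketch: a chain of linked edge points is recursively divided at the point farthest from the chord joining its end points until that maximum distance falls below the threshold. The function names and the threshold value below are assumptions made for illustration, not the authors' code.

import math
from typing import List, Tuple

Point = Tuple[float, float]

def point_to_chord_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def split_segment(points: List[Point], max_dist: float = 2.0) -> List[List[Point]]:
    """Recursively split a chain of edge points until every piece is nearly straight."""
    if len(points) <= 2:
        return [points]
    # distance of every interior point to the chord joining the two end points
    dists = [point_to_chord_distance(p, points[0], points[-1]) for p in points[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[worst - 1] <= max_dist:
        return [points]
    # split at the farthest point and process both halves in the same way
    return (split_segment(points[:worst + 1], max_dist)
            + split_segment(points[worst:], max_dist))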

3.2. Generalized Antiparallel Pairs

In high-resolution aerial photographs, a road has two parallel boundaries with opposite gradient directions. Nevatia and Babu [28] use antiparallel lines (apars) to describe a road segment. In their method, a road is assumed to be a straight linear feature. In reality, road boundaries are not always straight lines, but rather smooth curves. Therefore, to describe the structure of a road, generalized antiparallel pairs, briefly called antiparallel pairs, are proposed. A generalized antiparallel pair is defined as two smooth curves which are pointwise parallel to each other and across which the gradients of local intensity have opposite signs.

3.2.1. Generation of antiparallel pairs. The generation of antiparallel pairs starts with the computation of the spatial directions of the segments. Segments with a difference in spatial direction of less than 30° are selected. The gradient direction of a segment is defined by the difference of the average intensities on both sides and has a value of +1 for an inward gradient direction and −1 for an outward gradient direction. To ensure parallelism, the standard deviation of the distance between two segments is computed and used as another criterion. After the antiparallel pairs are generated, they are represented symbolically in terms of a number of attributes, which include the positions of the two boundaries L1 and L2, the directions of both ends (aa, ab), the average intensity, the gradient direction, and the length and width of the antiparallel pair (Fig. 1) [11]. L1 and L2 are two point chains which define the positions of the two boundaries. The end direction is defined as the normal to the connection between the two end points at the same end. Although they can be determined from the positions of the two end points, the end directions are still included in the attribute list in order to facilitate subsequent processing. An antiparallel pair is expressed as

antipair(antipair_no, left_side(L1), right_side(L2),
         attribute(end_directions, length, width, gradient,
                   average_intensity)).
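The pairing test described above can be sketched as follows: two edge segments form a candidate antiparallel pair when their spatial directions differ by less than 30°, their gradient directions have opposite signs, and the distance between them has a small spread. The names and threshold values below are illustrative assumptions only.

import math
import statistics
from typing import List, Tuple

Point = Tuple[float, float]

def segment_direction(points: List[Point]) -> float:
    """Spatial direction (degrees, modulo 180) of the chord joining the end points."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

def is_antiparallel_pair(seg1: List[Point], grad1: int,
                         seg2: List[Point], grad2: int,
                         max_angle: float = 30.0, max_spread: float = 1.5) -> bool:
    """Candidate test: similar direction, opposite gradient signs, near-constant spacing."""
    if len(seg1) < 2 or len(seg2) < 2:
        return False
    angle = abs(segment_direction(seg1) - segment_direction(seg2))
    angle = min(angle, 180.0 - angle)          # directions are only defined modulo 180 degrees
    if angle >= max_angle:
        return False
    if grad1 * grad2 != -1:                    # gradient directions (+1 / -1) must be opposite
        return False
    # spread of the distance from seg1 to seg2; a small spread means the curves stay parallel
    dists = [min(math.dist(p, q) for q in seg2) for p in seg1]
    return statistics.stdev(dists) < max_spread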

3.2.2. Grouping of antiparallel pairs. Due to the existence of image noise, occlusions such as shadows cast by trees, and low contrast between the road surface and its background, the generated antiparallel pairs are usually separated. These separated antiparallel pairs need to be grouped to form road-like features before recognition is performed. Most existing methods for grouping are based on the Gestalt laws, which use one or more properties of perceptual organization, e.g., proximity and collinearity. These methods can work well when the objects to be interpreted have regular shapes, e.g., rectangular or U-shaped structures. However, they may yield unsatisfactory results when objects have complex structures, especially when there are disturbances around the objects. Another problem with these methods is the selection of thresholds for the geometric constraints.

FIG. 1. A generalized antiparallel pair.

A road usually has homogeneous geometric and radiometric properties along its surface. Therefore, the geometric and radiometric properties of an antiparallel pair are used in the grouping of antiparallel pairs. Two antiparallel pairs are taken as direct neighbors belonging to the same road if they satisfy the following conditions:

a. the same road width
b. the same gradient direction
c. similar gray scale values
d. small difference in height
e. small difference in spatial direction
f. distance in space within a given threshold.

These conditions are used for the grouping of antiparallel pairs and are formulated as a rule in Prolog:

connect(X, Y) :-
    relation(X, Y, attribute(distance, difference_in_direction,
                             difference_in_height, difference_in_gradient,
                             difference_in_intensity, difference_in_width)),
    distance < Td,
    difference_in_direction < Ta,
    difference_in_height < Th,
    difference_in_gradient is 0,
    difference_in_intensity < Tg,
    difference_in_width < Tw.

Td, Ta, Th, Tg, and Tw are predefined thresholds. They can be determined based on knowledge of the image scale, the geometric properties of a road, the terrain type, and the computation accuracy of distance and direction. Relation is a structure which consists of a number of geometric and radiometric attributes describing the difference between two antiparallel pairs X and Y. In grouping, the above rule is applied to all antiparallel pairs, and those which satisfy the rule are connected to form road-like features. The grouped antiparallel pairs are used as the input to the recognition process. To discriminate them from the original antiparallel pairs, they are represented by the structure feature, which is similar to antipair. A feature is expressed as

feature(feature_no, left_side(S1), right_side(S2),
        attribute(length, width, gradient, average_intensity)).

S1 and S2 are two point chains which include all points belonging to the road-like feature. The attributes of the feature are computed from those of the antiparallel pairs which are grouped to form the feature.
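A procedural mirror of the rule connect, given here purely as an illustration, applies the same threshold tests pairwise and merges connected antiparallel pairs into road-like features with a union-find structure; the attribute names and threshold values are assumptions.

import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AntiPair:
    position: Tuple[float, float]   # representative point of the pair (assumed attribute)
    direction: float                # spatial direction in degrees
    height: float
    gradient: int                   # +1 or -1
    intensity: float                # average intensity
    width: float

def connect(x: AntiPair, y: AntiPair,
            Td=20.0, Ta=15.0, Th=2.0, Tg=10.0, Tw=2.0) -> bool:
    """True when two antiparallel pairs may belong to the same road (cf. rule connect)."""
    return (math.dist(x.position, y.position) < Td and
            abs(x.direction - y.direction) < Ta and
            abs(x.height - y.height) < Th and
            x.gradient == y.gradient and
            abs(x.intensity - y.intensity) < Tg and
            abs(x.width - y.width) < Tw)

def group(pairs: List[AntiPair]) -> List[List[int]]:
    """Merge mutually connectable antiparallel pairs into road-like features (index lists)."""
    parent = list(range(len(pairs)))
    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if connect(pairs[i], pairs[j]):
                parent[find(i)] = find(j)
    features = {}
    for i in range(len(pairs)):
        features.setdefault(find(i), []).append(i)
    return list(features.values())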

3.3. Recognition

In computer vision, an object is represented by a model. The object is interpreted by matching its features, derived by image processing, with models stored in a database. In knowledge-based object interpretation, the geometric and radiometric properties of an object, knowledge of the context and topology of the object, and a priori knowledge of the imaging system are expressed by a number of production rules or other forms, such as frames. The recognition of objects is then performed by applying these rules to the generated features. In our implementation, a hybrid control strategy is used, in which the hypotheses of road segments are generated in a bottom-up mode and a top-down procedure is applied to verify the hypotheses. To generate hypotheses, high-resolution images are used, because most details of the road surface can be detected in these images. Low-resolution images represent the global structure of the road network; therefore, they are used to generate the topology of the road network for the verification of hypotheses.

3.3.1. Generation of hypotheses. In hypothesis generation, knowledge about the objects to be interpreted is applied to the structures and the associated attributes and relationships generated in mid-level processing. The knowledge is usually expressed as production rules or in other forms, such as frames. When the generated structures, relationships, and associated attributes match the rules or the definitions in frames, the objects so defined are hypothesized. For the interpretation of a road, knowledge about its geometric and radiometric properties is applied to the road-like features derived in the previous steps, described by the structure feature with a number of geometric and radiometric attributes. The following is the rule for the recognition of a road segment:

road(X) :-
    feature(X, _, _, attribute(length, width, gradient,
                               average_intensity)),
    length > Tl,
    width > Wd - Td,
    width < Wd + Td,
    gradient is -1,
    average_intensity > G0 - Ti,
    average_intensity < G0 + Ti.

The gradient in the rule is set to −1 because the road surface usually has a higher reflectance than its background. Tl is the threshold for the length of road-like features, which is determined by the window size of the image processing. Wd is the road width, which can be found in the specifications of the road construction manual [30]. Td is the threshold for the road width, which can be determined by its accuracy of computation. G0 is the standard average intensity of the road surface, which may be obtained by statistical methods. Because G0 varies with many factors, such as the season, time and weather conditions of photography, the conditions of photographic processing, and the slope of roads, a high threshold Ti is usually given.
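The rule road reduces to a few interval tests, restated procedurally in the sketch below. The default values shown for Tl, Td, Ti, and G0 follow the thresholds quoted in Section 4; the road width Wd is a placeholder assumption rather than a value from the road construction manual.

def is_road(length: float, width: float, gradient: int, intensity: float,
            Tl: float = 200.0, Wd: float = 8.0, Td: float = 1.0,
            G0: float = 150.0, Ti: float = 30.0) -> bool:
    """Hypothesize a road segment when a grouped feature matches the road model.

    Tl, Td, Ti, and G0 follow the values quoted in the tests section; the road
    width Wd is a placeholder, to be taken from the road construction manual."""
    return (length > Tl and
            Wd - Td < width < Wd + Td and
            gradient == -1 and                  # road surface brighter than its background
            G0 - Ti < intensity < G0 + Ti)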

3.3.2. Verification of hypotheses. Verification of hypotheses is a crucial step in object recognition. The consistency among the generated hypotheses is checked and spurious hypotheses are rejected during this process. The missing parts of objects are also inferred and found. This is usually done using a priori knowledge about the objects to be interpreted, such as context and topology. Context defines the semantic relationships between objects; the usefulness of context for object recognition is explored in [15,22,31]. Topology describes the spatial relationships between objects and can be used to infer occlusions during hypothesis verification [13,32]. In our implementation, the part–whole relationships between roads in high-resolution images and roads in low-resolution images are used to verify the generated hypotheses and to preclude spurious hypotheses, and the spatial relationships between verified road segments are utilized to infer the missing road segments.

A road network in low-resolution images consists of lines and points. Topology defines the connectivity relationships between these elements. It can be depicted by a graph, which is composed of edges and vertices and is usually expressed as [33]

G = G(V, E),

V = (v_i), with i = 1, 2, . . . , m, and

E = (e_j), with j = 1, 2, . . . , n,

where G, E, and V stand for the graph, edges, and vertices, respectively, and m and n are the numbers of vertices and edges in the graph. Each edge in the graph is defined by two vertices, and two intersecting edges meet at a common vertex. These can be described by the structure edge and the relation adjacent in Prolog [34].
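The Prolog structures edge and adjacent can equally be sketched as a small graph container; the class and method names below are illustrative assumptions, not the authors' implementation.

from typing import Dict, Tuple

class RoadGraph:
    """Vertices, edges defined by two vertices, and an adjacency test on edges."""
    def __init__(self) -> None:
        self.vertices: Dict[int, Tuple[float, float]] = {}   # vertex id -> position
        self.edges: Dict[int, Tuple[int, int]] = {}          # edge id -> (vertex id, vertex id)

    def add_vertex(self, vid: int, position: Tuple[float, float]) -> None:
        self.vertices[vid] = position

    def add_edge(self, eid: int, v1: int, v2: int) -> None:
        self.edges[eid] = (v1, v2)

    def adjacent(self, e1: int, e2: int) -> bool:
        """Two edges are adjacent when they meet at a common vertex."""
        return bool(set(self.edges[e1]) & set(self.edges[e2]))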

After the topology of the road network has been generated, the part–whole relationships between roads in high-resolution images and roads in low-resolution images can be established. This is done by projecting the hypothesized road segments in the high-resolution images onto the low-resolution images. A hypothesized road segment is part of a road in the low-resolution image if its projection lies on the road. A part–whole relationship is expressed as [34]

part(h, edge),

where h is a hypothesized road segment in the high-resolution image and edge is a road in the low-resolution image. A hypothesized road segment is accepted when it is part of an edge in the graph and the ratio of the total length of all projected hypothesized road segments belonging to that edge to the length of the edge is larger than a given value.
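The acceptance test can be sketched as follows: each hypothesized segment is projected into the low-resolution image, assigned to a nearby network edge, and the hypotheses on an edge are accepted when their projected length covers a sufficient fraction of that edge. The distance tolerance and ratio threshold are illustrative assumptions; the scale factor simply mirrors the reduction factor of 15 used in the tests.

import math
from typing import List, Tuple

Point = Tuple[float, float]

def verify_hypotheses(hypotheses: List[Tuple[float, Point]],
                      edges: List[Tuple[float, List[Point]]],
                      scale: float = 1.0 / 15.0, tol: float = 3.0,
                      min_ratio: float = 0.6) -> List[int]:
    """hypotheses: (length, midpoint) in high-resolution pixels;
    edges: (length, point chain) in low-resolution pixels.
    Returns the indices of the accepted hypothesized road segments."""
    accepted = []
    for edge_len, chain in edges:
        covered, members = 0.0, []
        for i, (h_len, (hx, hy)) in enumerate(hypotheses):
            mx, my = hx * scale, hy * scale               # project onto the low-resolution image
            if min(math.hypot(mx - x, my - y) for x, y in chain) < tol:
                covered += h_len * scale                  # part(h, edge) holds
                members.append(i)
        if edge_len > 0 and covered / edge_len > min_ratio:
            accepted.extend(members)
    return accepted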

After the hypothesized road segments are verified, they are related by a spatial relationship which has the form [34]

neighbour(h1, h2, d),

where h1 and h2 are two verified road segments belonging to the same road and close to each other, and d is the distance between them. In the ideal case, two road segments should join together, i.e., the distance is zero. However, they are disconnected when the length of the road-like features between them is shorter than the threshold Tl defined in the rule road, or when there is a disturbance between them. In this case, missing road segments between two road segments can be hypothesized and a top-down procedure is applied to search for them. Segments which have no relations with roads in the low-resolution images are treated as spurious hypotheses and removed.
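The prediction of missing segments from the neighbour relation can be sketched as a scan along each road for gaps between consecutive verified segments; each gap becomes the target of the top-down search. The names and the maximum gap value are illustrative assumptions.

import math
from typing import List, Tuple

Point = Tuple[float, float]

def missing_gaps(segments: List[Tuple[Point, Point]],
                 max_gap: float = 100.0) -> List[Tuple[Point, Point, float]]:
    """segments: verified (start, end) point pairs already ordered along one road.
    Returns (end of one segment, start of the next, distance d) for every gap
    worth hypothesizing, i.e. the neighbour(h1, h2, d) relations with d > 0."""
    gaps = []
    for (_, a_end), (b_start, _) in zip(segments, segments[1:]):
        d = math.dist(a_end, b_start)
        if 0.0 < d <= max_gap:
            gaps.append((a_end, b_start, d))
    return gaps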

4. TESTS

In our tests, an aerial image of the Hunter Valley, New South Wales, is used (Fig. 3a). The image has a scale of 1:25,000 and was scanned with a pixel size of 30 µm. The area covered by the image is a rural area through which two highways pass and intersect each other. The extraction of roads starts with the hypothesis generation of road segments from high-resolution images. Figure 2 shows a portion of the test image, 500 × 500 pixels, in which there is a curved road segment. The Canny operator is applied first to extract edge information. The extracted edges are presented in Fig. 2a. After edge tracking, linking, and noise removal, the antiparallel pairs are generated based on the criteria of gradient direction, spatial direction of the edge segments, and the standard deviation of the computed distance between the segments. A total of 18 antiparallel pairs are generated, of which 6 correspond to the road and the others to nonroad objects. Due to the existence of trees and low contrast between the road surface and its background, the generated antiparallel pairs along the motorway are separated at the lower left and upper right parts (Fig. 2b). Having applied the rule connect given in Section 3 to the generated antiparallel pairs, the gaps between the antiparallel pairs along the road are bridged successfully (Fig. 2c). In recognition, the rule road is applied to all derived road-like features, and one hypothesis of a road segment is produced (Fig. 2d).

FIG. 2. Hypothesis generation: (a) extracted edges; (b) generated antiparallel pairs; (c) features formed after grouping; (d) generated hypothesis of road segment.

The thresholds for the length, width, and average intensity of road segments are set to 200 pixels, 1 pixel, and 30 gray levels, respectively, and 150 is chosen for the standard average intensity of the road surface in this example. In the ideal case, the threshold for the length of road segments should be equal to the size of the test window, assuming that a road segment passes through two opposite sides of the window. However, the segment may pass through two neighboring sides of the window, or it may be occluded in the window. Therefore, a value of less than half the window size is chosen in this example. Antiparallel pairs with a length less than the given threshold will not be hypothesized; this will be remedied in the process of hypothesis verification. The intensity value and its threshold are determined by taking the average and the standard deviation of the gray values of all road segments in the image. To obtain the optimal values for these parameters, more tests are needed.

To generate the topology of the road network, the original image is resampled by a reduction factor of 15. In the resampled image, a road appears as a line with a width of 1 to 3 pixels (Fig. 3a). The generation of the topology consists of line extraction and linking, extraction of intersections, removal of nonroad line segments, and grouping of the smooth line segments. For line extraction, a morphological operator is used [35], and the SE operator is used to detect corner points and intersections (Fig. 3b). Roads in low-resolution images are smooth line segments and usually have higher intensity than their background. Therefore, short line segments and segments with large curvatures are treated as nonroad segments and removed after line tracking and linking (Fig. 3c). Since the remaining line segments are separated, they need to be grouped to form a complete network. The grouping of the smooth line segments is accomplished based on their radiometric attributes (gray value and gradient) together with spatial constraints (distance and direction). After grouping, the isolated short segments are removed. In this example, the generated road network consists of three lines (road segments) and one vertex (intersection) (Fig. 3d).

FIG. 3. Topology of the network: (a) original image; (b) extracted lines; (c) smooth segments after processing; (d) network after grouping.
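The removal of nonroad line segments can be sketched with a simple length and curvature test; the curvature measure and both thresholds below are illustrative assumptions rather than the operators actually used.

import math
from typing import List, Tuple

Point = Tuple[float, float]

def keep_as_road_line(points: List[Point],
                      min_length: float = 10.0, max_turn_deg: float = 45.0) -> bool:
    """Keep an extracted line segment only if it is long enough and bends smoothly."""
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    if length < min_length:
        return False
    for a, b, c in zip(points, points[1:], points[2:]):
        d1 = math.atan2(b[1] - a[1], b[0] - a[0])
        d2 = math.atan2(c[1] - b[1], c[0] - b[0])
        turn = abs(math.degrees(d2 - d1))
        turn = min(turn, 360.0 - turn)
        if turn > max_turn_deg:                # sharp local turn: treat as a nonroad segment
            return False
    return True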

After the topology of the road network is generated, the relationships between roads in the high-resolution images and roads in the low-resolution images are established. Figure 4a shows the result of projecting the hypothesized road segments in the high-resolution image onto the low-resolution image. As can be seen, road segments are extracted in most areas (white lines), but in some test windows no hypothesis is generated (white boxes) or more than one hypothesis is generated (black box). With the established part–whole relationships, most hypothesized road segments are accepted, except the one in the black box. The missing road segments are predicted using the spatial relationships between the verified road segments and are detected in a top-down procedure. Finally, all road segments are connected to form a complete road network (Fig. 4b).

FIG. 4. Hypothesis verification: (a) generated hypotheses; (b) extracted road network after verification.

5. CONCLUSION

The paper presents a knowledge-based method for the automatic extraction of roads from aerial photographs and high-resolution remotely sensed imagery. The method includes low-level processing for feature extraction, mid-level processing for the generation and grouping of generalized antiparallel pairs, and high-level processing for the recognition of roads. It is based on a multiresolution road model in which road networks are defined in low- and high-resolution images and related by part–whole and specialization–generalization relationships. It uses a hybrid control strategy in which the hypotheses of road segments are generated in a bottom-up mode and a top-down procedure is used to predict the missing segments. A generalized antiparallel pair is proposed to describe the road structure. In order to achieve reliable results, the geometric and radiometric attributes of the antiparallel pairs are used and formulated into one rule. In hypothesis verification, the hypothesized road segments are verified based on part–whole relationships, and the spatial relationships between the verified road segments are used to predict the missing segments and remove false hypotheses. The results show that the road network in the test image is successfully extracted.

REFERENCES

1. Quam, A. Road tracking and anomaly detection in aerial imagery. In Proceedings of the DARPA Image Understanding Workshop, May 1978, pp. 51–55.

2. Fischler, M. A., Tenenbaum, J. M., and Wolf, H. C. Detection of roads and linear structures in low-resolution aerial imagery using a multisource knowledge integration technique. Comput. Graph. Image Process. 15 (1981), 201–223.

3. McKeown, D. M., and Delinger, J. L. Cooperative methods for road tracking in aerial imagery. In IEEE Proceedings on Computers and Computer Recognition, 1988, pp. 662–672.

4. Grun, A., and Li, H. Semi-automatic road extraction by dynamic programming. Int. Archives Photogrammetry Remote Sensing 30 (1994), 324–332.

5. Kass, M., Witkin, A., and Terzopoulos, D. Snakes: Active contour models. Int. J. Comput. Vision 1 (1988), 321–331.

6. Fua, P., and Leclerc, Y. G. Model driven edge detection. Mach. Vision Appl. 3 (1990), 45–56.

7. Trinder, J. C., and Li, H. Semi-automatic feature extraction by snakes. In Automatic Extraction of Man-Made Objects from Aerial and Space Images (A. Grun, O. Kubler, and P. Agouris, Eds.). Birkhauser Verlag, Basel, 1995, Vol. I, pp. 95–104.

8. Grun, A., and Li, H. Linear feature extraction with LSB-snakes from multiple images. Int. Archives Photogrammetry Remote Sensing 31 (1996), 266–272.

9. Trinder, J. C., and Wang, Y. Towards automatic feature extraction for mapping and GIS. In Geographical Information Systems and Remote Sensing Application, Hyderabad, India (I. V. Muralikrishna, Ed.), 1997, pp. 8–12.

10. Bajcsy, R., and Tavakoli, M. Computer recognition of roads from satellite pictures. IEEE Trans. Systems Man Cybernet. 6 (1976), 623–637.

11. Zhu, M. L., and Yeh, P. S. Automatic road network detection on aerial photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1986, pp. 34–40.

12. Wang, J. F., and Howarth, P. J. Automatic road network extraction from Landsat TM imagery. In ASPRS–ACSM Annual Convention, Baltimore, 1987, Vol. 1, pp. 429–438.

13. Ruskone, R., Airault, S., and Jamet, O. Road network interpretation: A topological hypothesis driven system. Int. Archives Photogrammetry Remote Sensing 30 (1994), 711–717.

14. Cleynenbreugel, J. van, Fierens, F., Suetens, P., and Oosterlinck, A. Delineating road structures on satellite images by a GIS-guided technique. Photogramm. Eng. Remote Sensing 56 (1990), 893–898.

15. Gunst, M. de. Knowledge-Based Interpretation of Aerial Images for Updating of Road Maps. Publications on Geodesy, No. 44, Netherlands Geodetic Commission, 1996.

16. Gunst, M. de, and Vosselman, G. A semantic road model for aerial image interpretation. In Semantic Modeling for the Acquisition of Topographic Information from Images and Maps (W. Forstner and L. Plumer, Eds.). Birkhauser Verlag, 1997, pp. 107–122.

17. Marr, D. Vision. Freeman, San Francisco, 1982.

18. Forstner, W. A framework for low level feature extraction. In Computer Vision-ECCV 94, Vol. II, pp. 383–394.

19. Fuchs, C., and Forstner, W. Polymorphic grouping for image segmentation. In Proceedings of the 5th ICCV, Boston, 1995, pp. 175–182.

20. Henricsson, O. Analysis of Image Structures Using Colour Attributes and Similarity Relations. Ph.D. Dissertation, Eidgenossische Technische Hochschule, Zurich, Switzerland, 1996.

21. Ballard, D. H., and Brown, C. M. Computer Vision. Prentice–Hall, Englewood Cliffs, NJ, 1982.

22. McKeown, D. M., Harvey, W. A., and McDermott, J. Rule-based interpretation of aerial imagery. IEEE Trans. Pattern Anal. Mach. Intell. 7 (1985), 570–585.

23. Strat, T. M., and Fischler, M. A. Context-based vision: Recognizing objects using information from both 2-D and 3-D imagery. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-13 (1991), 1050–1065.

24. Trinder, J. C., Wang, Y., Sowmya, A., and Palhang, M. Artificial intelligence in 3-D feature extraction. In Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Grun, E. P. Baltsavias, and O. Henricsson, Eds.). Birkhauser Verlag, Basel, 1997, pp. 257–266.

25. Lemmens, M. J. P. M. A survey of boundary delineation methods. Int. Archives Photogrammetry Remote Sensing 31 (1996), 435–441.

26. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (1986), 679–698.

27. Heitger, F. Feature Detection Using Suppression and Enhancement. Technical Report BIWI-TR-160, Institute for Communications Technology, Image Science Laboratory, Eidgenossische Technische Hochschule, Switzerland, 1995.

28. Nevatia, R., and Babu, R. Linear feature extraction and description. Comput. Graphics Image Process. 13 (1980), 257–269.

29. Grimson, W. E. L. Object Recognition by Computer: The Role of Geometric Constraints. MIT Press, Cambridge, MA, 1990.

30. Road Design Manual. Country Roads Board, Victoria, 1977.

31. Baumgartner, A., Eckstein, W., Mayer, H., Heipke, C., and Ebner, H. Context supported road extraction. In Automatic Extraction of Man-Made Objects from Aerial and Space Images (II) (A. Grun, E. P. Baltsavias, and O. Henricsson, Eds.). Birkhauser Verlag, Basel, 1997, pp. 299–310.

32. Rothwell, C. Reasoning about occlusions during hypothesis verification. In Computer Vision-ECCV 96 (B. Buxton and R. Cipolla, Eds.), Vol. I, pp. 599–609.

33. Chachra, V., Ghare, P. M., and Moore, J. M. Applications of Graph Theory Algorithms. Elsevier, New York, 1979.

34. Wang, Y., and Trinder, J. C. Use of topology in automatic road extraction. Int. Archives Photogrammetry Remote Sensing 32 (1998), 394–399.

35. Haralick, R. M., and Shapiro, L. G. Computer and Robot Vision. Addison–Wesley, Reading, MA, 1990.

JOHN C. TRINDER received his Bachelor's degree in surveying (honors) from the School of Geomatic Engineering, formerly Surveying, of the University of New South Wales in 1963, and his M.Sc. and Ph.D. from the International Institute for Aerospace Survey and Earth Sciences (ITC), the Netherlands, and the University of New South Wales in 1965 and 1971, respectively. He joined the School of Geomatic Engineering of the University of New South Wales in 1965 and became professor in 1991. He is a fellow of the Institute of Surveyors, Australia. He has been specializing in teaching and research in photogrammetry, remote sensing, and geographical information systems (GIS), for which he has received a number of awards. His research interests include feature extraction from remotely sensed imagery, image understanding, machine vision, and terrain modelling. He has been Chairman of the New South Wales Chapter of the Remote Sensing and Photogrammetry Association of Australia and has held several executive positions in the International Society for Photogrammetry and Remote Sensing, currently being Secretary General. He is Head of the School of Geomatic Engineering at the University of New South Wales.

YANDONG WANG received his Bachelor's degree in photogrammetry and remote sensing and his advanced postgraduate diploma from the Wuhan Technical University of Surveying and Mapping, China, and the International Institute for Aerospace Survey and Earth Sciences (ITC), the Netherlands, in 1982 and 1986, respectively. He joined the School of Photogrammetry and Remote Sensing of the Wuhan Technical University of Surveying and Mapping in 1982 and was a lecturer from 1987 to 1995. Currently, he is a Ph.D. candidate in the School of Geomatic Engineering of the University of New South Wales. His research interests include feature extraction, image understanding, and machine vision.
