
How to make a straight line square

Network Analysis in Raster GIS with time-dependent cost variables

Thesis for the MSc in GIS at

University of Leicester

1999-2000

Author: Jan Husdal



Abstract

Network analysis and least cost paths have long been the dominion of vector GIS. This research

explores the topic of network analysis in raster GIS, using MFworks as example software. Current

algorithms, procedures and network modelling techniques are investigated and common artefacts are explained. An extension of Tomlin's directional identifiers is proposed, allowing the modelling of

non-planar features. Along with this, the integration of time-dependent travel cost variables is

achieved through linking MFworks with an external Visual Basic application for updating the cost-of-

passage surface, demonstrating that such interaction extends the inherent capabilities of a GIS

engine. Another conclusion to be drawn from this paper is that network analysis in raster GIS is a

variant of surface analysis.


Table of contents

1 INTRODUCTION AND OUTLINE
2 AIMS AND OBJECTIVES
  2.1 VISION
  2.2 METHODOLOGY
  2.3 EXPECTED RESULTS
3 BACKGROUND RESEARCH AND LITERATURE REVIEW
  3.1 GRAPH THEORY AND FASTEST PATHS
  3.2 FASTEST PATH ALGORITHMS
    3.2.1 Static fastest paths
    3.2.2 Dynamic fastest paths
  3.3 FASTEST PATHS IN RASTER GIS
    3.3.1 Turning a surface into a network
    3.3.2 Tomlin
    3.3.3 Eastman
    3.3.4 Douglas
    3.3.5 Xu and Lathrop
    3.3.6 McIlhagga
    3.3.7 Collischonn and Pilar
    3.3.8 Berry
    3.3.9 Network analysis or surface analysis?
4 MFWORKS
  4.1 ABOUT MFWORKS
  4.2 NETWORK ANALYSIS IN MFWORKS
  4.3 MFCOM
5 APPLICATION DEVELOPMENT
  5.1 DATA ACQUISITION
  5.2 NETWORK MODELLING
    5.2.1 Incremental Linkage - inferring linear features
    5.2.2 Directional Identifier - constraining directions
  5.3 COST SURFACE MODELLING
    5.3.1 General considerations
    5.3.2 Determining travel cost and path length
  5.4 IMPLEMENTING A DYNAMIC COST SURFACE
    5.4.1 External application for updating cost surface
    5.4.2 Continuously updated cost surface
    5.4.3 Time interval cost surface
  5.5 RECAPITULATING THE PROCESS
6 DISCUSSION AND EVALUATION OF RESULTS
  6.1 CHOICE OF SOFTWARE
  6.2 CHOICE OF DATA
  6.3 GENERAL ACCURACY
  6.4 SPREAD ARTEFACTS
  6.5 CONTINUOUSLY UPDATED COST SURFACE
  6.6 TIME INTERVAL COST SURFACE(S)
  6.7 EXTERNAL VISUAL BASIC APPLICATION
  6.8 MEETING THE OBJECTIVES
7 FUTURE DEVELOPMENTS
  7.1 MODELLING NON-PLANAR NETWORK FEATURES
  7.2 APPLICATION DEVELOPMENT
8 CONCLUDING REMARKS
9 ACKNOWLEDGEMENTS
10 REFERENCES
11 APPENDICES
  APPENDIX A - NETWORK ANALYSIS IN MFWORKS, STEP BY STEP
  APPENDIX B - NETWORK ANALYSIS IN MFWORKS, TRAVERSING TIME INTERVALS
  APPENDIX C - USING THE VISUAL BASIC APPLICATION


Table of figures

FIGURE 3-1: NETWORK REPRESENTATION IN GIS
FIGURE 3-2: TRACING A PATH FROM CELL TO CELL IN RASTER GIS
FIGURE 3-3: DOUGLAS' METHOD OF INTERPOLATING A LINEAR PATH
FIGURE 3-4: INFERRING LINEAR FEATURES IN RASTER CELLS, TOMLIN VERSUS DOUGLAS
FIGURE 3-5: SAMPLING METHODS FOR ACCURATELY SIMULATING SPREAD PHENOMENA
FIGURE 3-6: MEASUREMENTS RELATED TO THE INCREMENTAL TIME OF THE LINK
FIGURE 3-7: EFFORT DISTANCE VERSUS FIXED COST DISTANCE
FIGURE 3-8: A PRACTICAL EXAMPLE OF EFFORT DISTANCE VERSUS FIXED COST DISTANCE
FIGURE 3-9: LEAST COST PATH ALGORITHM
FIGURE 3-10: BEST ROUTE OF A ROAD UP A CONICAL MOUNTAIN
FIGURE 3-11: ADDING THE PROXIMITY SURFACES YIELDS THE OPTIMAL PATH
FIGURE 3-12: OPTIMAL PATH BETWEEN MULTIPLE DESTINATIONS
FIGURE 4-1: FINDING THE OPTIMAL PATH USING MFWORKS
FIGURE 5-1: DATA USED FOR EXPLORING NETWORK ANALYSIS IN RASTER GIS
FIGURE 5-2: INCREMENTAL LINKAGE
FIGURE 5-3: BUILDING A ROAD NETWORK USING INCREMENTAL LINKAGE
FIGURE 5-4: TOMLIN'S DIRECTIONAL IDENTIFIERS
FIGURE 5-5: INFERRING FLOW DIRECTIONS FROM INCREMENTAL LINKAGE VALUES
FIGURE 5-6: TOMLIN'S INCREMENTAL LENGTH
FIGURE 5-7: ACTUAL PATH LENGTH AND INCREMENTAL LINKAGE PATH LENGTH
FIGURE 5-8: SOUGHT PATH THROUGH A JUNCTION
FIGURE 5-9: INFERRED PATH THROUGH A JUNCTION
FIGURE 5-10: CORRECTLY MODELLED JUNCTION
FIGURE 5-11: EXAMPLE OF CORRECTLY MODELLED PATH THROUGH A JUNCTION
FIGURE 5-12: TYPICAL USER INTERFACE FOR VISUAL BASIC APPLICATION
FIGURE 6-1: TYPICAL "MULTIPLE OPTIMAL PATH"-ARTEFACT
FIGURE 6-2: SOLVING THE MULTIPLE PATH PROBLEM USING EUCLIDEAN DISTANCE
FIGURE 6-3: SOLVING THE MULTIPLE PATH PROBLEM USING A COST SURFACE
FIGURE 7-1: EXTENDING TOMLIN'S DIRECTIONAL IDENTIFIERS
FIGURE 7-2: MODELLING OVERPASSES AND CROSSING OF LANES


1 Introduction and outline

Imagine you are about to set out on a journey to where you've never been before. Of course, you

take your road atlas and plan how to get there. You look for the fastest way. In doing so, you count

miles, you choose major roads, look for short cuts, avoid urban areas, you may even think of what

time you are going to travel at and how long it might take at that particular time. In practice, you are

trying to find the fastest path through the road network. Computation of fastest paths is one of the

most fundamental problems in route planning and has been subject to extensive research for many

years. The majority of published research on fastest paths algorithms has dealt with static networks

that have fixed topology and fixed costs for traversing the network, providing at best an averaged

travel time.

In contrast, this research will deal with dynamic networks, with time-dependent or randomly

varying travel cost, as is in fact the case in real-world networks. Furthermore, contrary to conventional

standards, this research will be undertaken using raster GIS.

Raster-based GIS are not commonly known for network analysis capabilities. However, there is one

existing software package, MFworks, developed by Thinkspace, which has an extensive set of operators that

enable network-type analyses. The fact that MFworks is developed directly upon Tomlin's map

algebra operators (Tomlin, 1991) provides an interesting backdrop for this research.

The final inspiration for using raster GIS and MFworks in particular came through a review of raster-

based GIS and their capabilities in the June 2000 issue of GeoEurope (Limp, 2000), featuring

MFworks as the only raster GIS capable of true network analysis. Triggered by this, the author

decided to investigate the validity of this promotional claim.

Thus, the idea for this paper is to explore the topic of network analysis in raster GIS, using MFworks

as example software, to discuss capabilities and limitations of raster GIS, to project possible

improvements and to develop a concept for applying time-dependent travel cost in network analysis.

This research is undertaken in understanding with Thinkspace, and if successful, will provide a

valuable extension to the functionality of MFworks.


2 Aims and objectives

2.1 Vision

The overall vision for this research is the integration of real-world events into route planning, ultimately

allowing real-time traffic information to be used in calculating the optimal path through a continuously

updated dynamic network.

The concrete aim of this research is to develop a method that allows a network analysis tool to find

the fastest path through a dynamic network, simulating time dependent or randomly varying (real-

time) traffic.

From an academic viewpoint, a further aim of this research is to prove that network analysis is a

function that is not restricted to vector GIS only, but is equally feasible in raster GIS, something the

author has already pointed out in his previous research.

2.2 Methodology

Two approaches can be discerned:

A continuously updated network, where the latest available data at the starting time of travel is used,

and where the cost of travel does not change during the duration of the estimated route.

A network with varying travel cost per pre-defined time interval, where the travel cost is dependent

on the starting time of travel, and where the travel cost changes when the estimated route passes

from one time interval to another.

In practice, the first approach involves building an application that passes on values to the cost

surface used in a raster-based GIS for calculating the fastest path in a network. The second

approach involves working with multiple cost surfaces for estimating the fastest path.


2.3 Expected results

Both approaches are to be investigated and, where possible, implemented using the network

analysis functions of MFworks, as a paradigm for other raster GIS. The expected results are:

A proof that network analysis is feasible in raster GIS, based on Tomlin's map algebra

operations.

A concept for calculating dynamic paths through a network.

A custom-built application for MFworks for calculating dynamic paths through a network.


3 Background research and literature review

Euler's famous Königsberg bridge question, dating back as far as 1736, is often seen as the

starting point of modern path finding: was it possible to find a path through the city crossing each of

its seven bridges once and only once and then returning to the origin? Euler's methods formed the

basis of what is known as graph theory, which in turn paved the way for path finding algorithms.

Traditionally, network analysis, path finding and route planning have been the domain of graph

theory and vector GIS, which is where most algorithms find their application. However, it is not

difficult to adapt these algorithms to a raster environment, as will be highlighted in this chapter.

Raster applications are more likely to be based on movement across a surface than movement along

a network, since the general idea of finding the least cost path is linked to movement from cell to cell,

and not along a finite line. Many researchers have thus sought to overcome the shortcomings of the

raster approach and have developed various solutions and proposals. In order to appreciate their

efforts, a synopsis of the conventions that surround path-finding algorithms is first necessary.

3.1 Graph theory and fastest paths

A network model can be defined as a line graph, which is composed of links representing linear

channels of flow and nodes representing their connections (Lupien et al., 1987). In other words, a

network takes the form of edges (or arcs) connecting pairs of nodes (or vertices). Nodes can be

junctions and edges can be segments of a road or a pipeline. For a network to function as a real-

world model, an edge will have to be associated with a direction and with a measure of impedance,

determining the resistance or travel cost along the network, as shown in figure 3-1.

Figure 3-1: Network representation in GIS. From: Husdal (1999)
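
As an illustrative sketch (not taken from the thesis), a directed network with impedances could be held in a simple adjacency structure. The node names, the `Edge` class and the impedance values below are hypothetical; impedance is assumed to be travel time in minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: str        # node where the edge begins
    end: str          # node where the edge ends (direction: start -> end)
    impedance: float  # assumed travel cost along the edge, in minutes


def build_adjacency(edges):
    """Map every node to the (neighbour, impedance) pairs reachable from it."""
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge.start, []).append((edge.end, edge.impedance))
        adjacency.setdefault(edge.end, [])  # make sure sink nodes are listed too
    return adjacency


# A two-way road A<->B with a different cost per direction, and a one-way link B->C.
edges = [Edge("A", "B", 4.0), Edge("B", "A", 6.0), Edge("B", "C", 2.5)]
network = build_adjacency(edges)
```

Storing the two directions of a road as separate edges is one simple way of letting direction and impedance differ per direction of travel.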


Traditionally, a GIS represents the real world in either one of two spatial models, vector-based, i.e.

points, lines and polygons, or raster-based, i.e. cells of a continuous grid surface. Since the

modelling of network structures intuitively refers to lines and points, vector GIS has dominated the

realm of network analysis. This research will show that network analysis is equally feasible in raster

GIS.

3.2 Fastest path algorithms

Because path finding is applicable to many kinds of networks, such as roads, utilities, water,

electricity, telecommunications and computer networks alike, the total number of algorithms that

have been developed over the years is immense.

Pang and Deo (1979) set up a nomenclature of shortest path algorithms, describing as

many as 222 different algorithms, dating back as far as 1958. Their paper provides an excellent

classification scheme, a brief description of each algorithm and highlighted comparisons of particular

algorithms.

Semantically one can distinguish between path finding in a fixed static network, with set costs for

traversing the network, and path finding in a dynamic network, where the cost of traversing the

network varies over the time of traversing.

One way of dealing with dynamic networks is splitting continuous time into discrete time intervals

with fixed travel costs, as noted by Chabini (1997). Thus, understanding shortest path algorithms in

static networks becomes fundamental to working with dynamic networks.
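
The idea of discrete time intervals with fixed travel costs can be sketched as follows. This is a minimal illustration only: the interval boundaries, the per-link costs and the function names are hypothetical, and the sketch assumes that the cost of a link is fixed by the interval in which the traveller enters it.

```python
import bisect

# Hypothetical interval start times, in minutes after midnight:
# night, morning peak, daytime, evening peak, late evening.
INTERVAL_STARTS = [0, 420, 540, 960, 1080]


def interval_index(clock):
    """Return the index of the time interval that contains `clock`."""
    return bisect.bisect_right(INTERVAL_STARTS, clock % 1440) - 1


def route_travel_time(link_costs, departure):
    """Sum travel time along a route whose link costs depend on the interval.

    `link_costs` holds one list per link, with one cost (minutes) per interval.
    The cost of a link is taken from the interval in which the link is entered.
    """
    clock = departure
    for costs in link_costs:
        clock += costs[interval_index(clock)]
    return clock - departure


# Three consecutive links; the third has the same cost all day.
route = [
    [5, 9, 6, 10, 5],
    [4, 8, 5, 8, 4],
    [3, 3, 3, 3, 3],
]

before_peak = route_travel_time(route, departure=410)   # stays in the night interval
crossing = route_travel_time(route, departure=416)      # enters the peak on link 2
in_peak = route_travel_time(route, departure=430)       # whole trip in the peak
```

Within each interval the problem is an ordinary static one, which is why a static algorithm can be reused once the interval has been identified.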

3.2.1 Static fastest paths

The majority of published research on shortest paths algorithms has dealt with static networks that

have fixed topology and fixed costs. Given the computational constraints of past

computer systems this is not surprising. Not more than a decade ago, Van Eck (1990) reported

several hours as an average time for a computer to churn through an all-to-all calculation on a

250-node small-scale static network, and several days on a 16,000-node large-scale network.


Several algorithms and data structures for algorithms have been put forward since the classic

shortest path algorithm by Dijkstra (1959). This algorithm computes a path in all directions from the

origin node and terminates when the destination has been reached.
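
A minimal sketch of a Dijkstra-style search with early termination at the destination is given below. It uses a binary heap, which is a later refinement rather than part of the 1959 formulation, and the small road network and its costs are made up for illustration.

```python
import heapq


def dijkstra(adjacency, origin, destination):
    """Return (cost, path) of the least-cost route, or (inf, []) if unreachable.

    `adjacency` maps each node to a list of (neighbour, non-negative cost) pairs.
    """
    best = {origin: 0.0}      # lowest cost found so far for each node
    previous = {}             # predecessor of each node on its best route
    visited = set()
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue          # stale queue entry, node already settled
        visited.add(node)
        if node == destination:
            break             # terminate as soon as the destination is settled
        for neighbour, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in visited:
        return float("inf"), []
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return best[destination], path


# Hypothetical directed road network with travel costs.
roads = {
    "A": [("B", 4), ("C", 2)],
    "B": [("D", 5)],
    "C": [("B", 1), ("D", 8)],
    "D": [],
}
cost, path = dijkstra(roads, "A", "D")
```

In this example the route A, C, B, D (cost 8) is preferred over the more direct A, B, D (cost 9).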

The original Dijkstra algorithm explores all directions from the starting node, thereby searching an

unnecessarily large search area. This led to the development of heuristic searches, among them the

A* algorithm, introduced by Mitchell and Kiersey (1984), that searches in the direction of the

destination node. This avoids considering directions with unfavourable results and reduces

computation time.

Another improvement is seen in the bi-directional search, computing a path from both origin and

destination, and ideally meeting at the middle, which then terminates the search. However, as

Dreyfus (1969) notes, in certain cases, the number of iterations required may actually exceed that of the

Dijkstra algorithm, or the search may even produce a false result.

The A* algorithm, along with Dijkstra-based algorithms, seems to be preferred in most of the

literature researched by the author. It is in fact noteworthy that the Dijkstra algorithm has prevailed to

the present date, proving its universal validity.

3.2.2 Dynamic fastest paths

A few early attempts at dynamic approaches, referenced by Chabini (1997), are Cooke and Halsey

(1966) and Dreyfus (1969). Current literature suggests that dynamic network fastest path problems

can be reduced to static fastest path problems if continuously varying link travel times are expanded

for a time interval or given an estimated value. This makes the A* and Dijkstra algorithms applicable

in both static and dynamic networks. Adaptations of Dijkstra's algorithm are frequently found in raster

GIS applications; thus it seems natural not to delve more deeply into various dynamic path algorithms,

as these are in most cases applied in graph theory, and not related to the research undertaken here.


3.3 Fastest paths in raster GIS

Even though vector GIS has dominated the network analysis scene, this does not mean that finding

the shortest path is not viable in a raster environment. However, in order to understand network

analysis in raster GIS, the particular properties of raster GIS must be understood first.

3.3.1 Turning a surface into a network

In a raster GIS, cartographic space is defined as a surface, where the value of a particular property

varies over this surface. In order to adapt a network structure, each cell may be seen as a node

linked to its eight neighbouring cells. The cell value of each node then can represent the cost of

traversing this particular cell. This cost-of-passage surface is a grid where the values associated with

the cells are used as weights to calculate least cost paths. These weights may represent the

resistance, friction or difficulty in crossing the cell and may be expressed in terms of cost, time,

distance or risk (Collischonn and Pilar, 2000). Starting from a given destination cell, it is then possible

to spread outward and calculate for each surrounding cell, the accumulated cost of travelling from

any surrounding cell to the destination cell. From this accumulated surface it is then possible to

delineate the shortest or least-cost path to the destination cell from any surrounding cell (Douglas,

1994), simply by following the path with the least accumulating friction.
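
The spreading step described above can be sketched with a Dijkstra-style search over the grid, treating every cell as a node linked to its eight neighbours. This is only an illustration under stated assumptions: the cost of one move is taken as the distance between cell centres (1 or the square root of 2, in cell units) times the mean friction of the two cells, and the function and variable names are hypothetical rather than MFworks operations.

```python
import heapq
import math

# Offsets (row, column) to the eight neighbouring cells.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def accumulated_cost(friction, target):
    """Return a grid with the least accumulated cost from every cell to `target`."""
    rows, cols = len(friction), len(friction[0])
    acc = [[math.inf] * cols for _ in range(rows)]
    acc[target[0]][target[1]] = 0.0
    queue = [(0.0, target)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if cost > acc[r][c]:
            continue  # a cheaper route to this cell was already found
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed rule: distance times mean friction of the two cells.
                step = math.hypot(dr, dc) * (friction[r][c] + friction[nr][nc]) / 2
                if cost + step < acc[nr][nc]:
                    acc[nr][nc] = cost + step
                    heapq.heappush(queue, (cost + step, (nr, nc)))
    return acc


# Uniform cost-of-passage surface with the destination in the upper-left corner.
friction = [[1, 1, 1],
            [1, 1, 1],
            [1, 1, 1]]
surface = accumulated_cost(friction, target=(0, 0))
```

On this uniform surface the opposite corner receives an accumulated cost of two diagonal moves, about 2.83, and the cell two columns to the right of the target a cost of exactly 2.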

This network adaptation from a surface has its shortcomings. A straight line indicating the shortest or

least-cost distance from a starting node to a destination node must follow a zigzag line directed by

the grid resolution, and thus only can approximate the correct distance (Fig 3-2).

Figure 3-2: Tracing a path from cell to cell in raster GIS generates a zigzag path instead of a straight line.
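
The size of this zigzag effect can be estimated with a small calculation. The sketch below assumes unit cell size, a uniform cost surface and movement restricted to the eight neighbouring cells; the function names are illustrative only.

```python
import math


def eight_direction_length(d_cols, d_rows):
    """Length of the shortest path using only straight and diagonal cell moves."""
    d_cols, d_rows = abs(d_cols), abs(d_rows)
    diagonal_steps = min(d_cols, d_rows)
    straight_steps = max(d_cols, d_rows) - diagonal_steps
    return diagonal_steps * math.sqrt(2) + straight_steps


def straight_line_length(d_cols, d_rows):
    """Euclidean distance between the two cell centres."""
    return math.hypot(d_cols, d_rows)


# A destination 10 columns and 4 rows away from the start.
zigzag = eight_direction_length(10, 4)
direct = straight_line_length(10, 4)
overestimate = zigzag / direct - 1
```

For this offset the cell-to-cell path is roughly 8 percent longer than the straight line, while purely horizontal, vertical or diagonal offsets are measured without error.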

There are three main network problem types, related to the ways that weights or resistances are

assigned to each of the network links: isotropic, partially anisotropic, and fully anisotropic.

In an isotropic network, the cost-of-passage is dependent on location only, and the surface consists

of one or more homogeneous and coherent patches. Resistance is independent of direction, i.e.


equal in all directions. The second type has direction-dependent cost, but with one prevailing

direction over the whole surface, as is the case in the fire spreading study in Xu and Lathrop (1995).

In a true anisotropic network the cost of passage is direction-dependent and varies over the whole

surface, as is the case in most road networks (Collischonn and Pilar, 2000).

3.3.2 Tomlin

The concepts surrounding surface analysis for optimal paths date back to the late 1970s (Berry, 2000;

personal communication), and were later championed by Dana Tomlin with his dissertation (Tomlin,

1983), which was later published as a book. Here, Tomlin (1991) classifies his map algebra

operators based on how the computer algorithm obtains data values for processing and identifies

three fundamental classes: local, focal and zonal functions.

He introduces a spread algorithm (focal function) for calculating proximity surfaces, delineating the

shortest possible distance from any location to a destination point. In this method, all cells are

initialised to no value. Passes are then made through the image with each cell checking the cost of

travel from each of its adjacent cells that contain a value. If the incremental effort distance is less

than the difference in cumulative cost or the cell contains no value then the cell requires updating to

a new value. Once a pass is completed without any changes, the cost surface has been generated

(Tomlin, 1986; McIlhagga, 1997).
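
The repeated-pass procedure described above can be sketched as follows. This is an interpretation of the description, not Tomlin's or MFworks' actual code: the move cost (distance times mean friction of the two cells) is an assumption, cells are updated in place during a pass, and the update test is written as "candidate is lower than the stored value", which is algebraically the same as the incremental cost being less than the difference in cumulative cost.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def spread(friction, target):
    """Return (accumulated cost grid, number of passes) for a repeated-pass spread."""
    rows, cols = len(friction), len(friction[0])
    acc = [[None] * cols for _ in range(rows)]  # None means "no value yet"
    acc[target[0]][target[1]] = 0.0
    passes = 0
    changed = True
    while changed:                      # stop after a pass without any change
        changed = False
        passes += 1
        for r in range(rows):
            for c in range(cols):
                for dr, dc in NEIGHBOURS:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if acc[nr][nc] is None:
                        continue        # neighbour holds no value yet
                    step = math.hypot(dr, dc) * (friction[r][c] + friction[nr][nc]) / 2
                    candidate = acc[nr][nc] + step
                    if acc[r][c] is None or candidate < acc[r][c]:
                        acc[r][c] = candidate
                        changed = True
    return acc, passes


friction = [[1, 1, 1],
            [1, 1, 1],
            [1, 1, 1]]
surface, passes_needed = spread(friction, target=(2, 2))
```

The number of passes depends on the grid size, the position of the target and the scanning order, which is the inefficiency on large images noted in the next section.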

3.3.3 Eastman

McIlhagga (1997, see below) notes that Tomlin's method is efficient for small areas or narrow

passages such as a road network; however it becomes very inefficient when large images are

processed. Ronald Eastman remarks that a 512 by 512 grid could require 700 passes to produce a

cost surface (Eastman, 1987). As part of the IDRISI GIS system, Eastman implemented a second

algorithm to generate cost surfaces, called the "pushbroom" procedure: "...the [procedure operates] by

pushing effects through the image, much like a pushbroom would be used to systematically clean a

room. Effects then ripple through the image, much like water being pushed over a wet floor."

(Eastman, 1987). The cost surface image is initialised with no values except the target cell and a

pass through the image is made from upper right to lower left with each cell updating the cell to the

right and each cell below. In this manner all following cells are updated so that the most recent

change is pushed forward with each successive update. Generally, three passes have been found to

be enough to generate a cost surface (McIlhagga, 1997).


3.3.4 Douglas

Douglas (1994 and 1999) attempts to overcome the limitations of a zigzagging least-cost path by

interpolating the line of the path over the cell. The path does not follow a straight line from centre to

centre, but is determined by first dividing each cell into four triangular facets and then tracing the

path through the four facets (Fig 3-3).

Figure 3-3: Douglas' method of interpolating a linear path across a raster cell. From: Douglas (1999)

Douglas' procedure is an improvement, as the optimal path is more clearly and accurately delineated

than in a fixed grid with only 8 directions as used by Tomlin. Figure 3-4 shows one possible

comparison between Tomlin's and Douglas' concepts.

Figure 3-4: Inferring linear features from raster cells, (left) neighbouring cells, (middle) Tomlin's Incremental Linkage,

(right) Douglas' interpolation, showing one possible solution

Douglas' method of tracing the least-cost path demands a partially anisotropic surface, and assumes

that any least cost-of-passage is prevailing in the direction of the steepest slope. In a road network

this is rarely, if ever, the case, as shown by Collischonn and Pilar (2000).

In personal communication, Douglas remarks that the reason for this is his view of cell values as

samples of a location surface. Thus, any link between cells will not necessarily be a straight line

between cell centres, but must be found by interpolation, since the cell centre itself is a sample

value.


3.3.5 Xu and Lathrop

An attempt to overcome zigzagging outlines of spread zones was undertaken by Xu and Lathrop

(1995). Using the 8 adjacent cells as directions for outward spread leads to an overestimation of

travel time and an underestimation of outward spread distance: the conventional 8-cell link

approach simulates a spread shape inadequately, since too few of all possible links are

evaluated for connecting properties. Thus, their algorithm investigates the efficacy of adding

additional cells beyond the conventional 8 neighbouring cells as links in the algorithm. These cells

are not adjacent to the spread cell, but can be regarded as connected to it through the cells that are

in-between. Figure 3-5 displays some possible sampling methods.

Figure 3-5: Sampling methods for accurately simulating spread phenomena. From: Xu and Lathrop (1995)


Figure 3-6: Measurements related to the incremental time of the link. From: Xu and Lathrop (1995)

Figure 3-6 illustrates how Xu and Lathrop calculate the path from the spread cell to any given link

cell. The accumulated cost-of-passage from cell X to a non-adjacent cell Y is then equal to the sum

of the incremental costs for each leg X_i to Y_i along the link from X to Y, calculated in direct

Euclidean distance from X to Y (Fig 3-6).

3.3.6 McIlhagga

McIlhagga (1997) introduces a new term: fixed-cost distance, as opposed to the usually applied

effort-distance; costs associated with effort distance are incurred every time a movement over a path

occurs or more generally, movement between cells. Fixed cost distance is the cost associated with

creating a path for linking multiple cells to a destination, and is incurred only once, when the path is

created. Thus, minimising effort distance involves minimising the cost from a given point to any

target (the accumulated cost surface for this given point); minimising fixed cost distance requires

finding the optimal path that connects a point to all targets regardless of minimising the path to any of

these targets in isolation (McIlhagga, 1997). Figure 3-7 gives a simple illustration of McIlhagga's fixed

cost distance.


Figure 3-7: Effort Distance versus Fixed Cost Distance. From: McIlhagga (1997)

Later developed into software called "Pathways", the core algorithm in McIlhagga's research is an

extension of Ron Eastman's Push Algorithm applied for multiple targets. In general, the algorithm

"builds" optimal cost surfaces for combinations of targets until all targets have been included. The

key issue for the "Pathways" algorithm is to optimise and predict which combinations will provide the

greatest contributions to a final solution. So for instance, if it is possible to predict that the

combination of target X and target Y with target A & target B provides very useful values this can

mean that all the other sub-combinations to get to this point do not need to be calculated and can be

discarded. A finely tuned prediction engine can result in huge optimisations (McIlhagga, 2000;

personal communication). Figure 3-8 shows a practical example of finding the optimal path from a

forest road to logging sites.

Figure 3-8: Effort distance finds the shortest paths from target to nearest road (left), while fixed cost distance finds the

optimal path linking all targets, if possible, to one path (right). From: McIlhagga (2000).

McIlhagga disagrees with Tomlin and Berry on their method of tracing the optimal path where the

least accumulated friction is, or as Douglas put it, the steepest downhill slope. McIlhagga claims that

the correct procedure is to select, from the start point, the adjacent cell for which the incremental

cost between the two cells is equal to the difference in their cumulative cost to the target; repeating

this step delineates an optimal path (McIlhagga, 1997), as shown in the following example:


Figure 3-9: Least cost path algorithm. Target is cell at row/column 2/2. Left: cost surface, middle and right: accumulated

cost surface from target cell. Middle: Incorrect path following largest descent from cell at 3/5. Right: Correct

path following accumulated cost. Adapted from McIlhagga (1997)

Given the cell in row/column 2/2 as target and following Tomlin, the shortest path from the cell at

row/column 3/5 to the target will first go to the cell at 2/4, since this path has the steepest fall in

accumulated cost. The correct path is via the cell at 3/4. The reason is clear: The shown value for

the difference in accumulated cost from cell 3/5 to cell 2/4, without considering how the values were

inferred, is equal to 2.6 for the incorrect path. However, the calculated value from cell 3/5 to cell 2/4

is (0.7 x 2 + 0.7 x 2), equal to 2.8, indicating a discrepancy. For the correct path, the values are 2.0 and

2.0 for the shown and calculated values respectively, thus making the right path in Figure 3-9 the

correct delineation of the least cost path.
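
The difference between the two tracing rules can be illustrated with a minimal Python sketch. It is not McIlhagga's implementation: the cost grid is hypothetical (it does not reproduce the values of Figure 3-9), and the step cost is assumed to be half of each cell's cost for orthogonal moves and 0.7 of each cell's cost for diagonal moves, which is consistent with the arithmetic quoted above. All function names are illustrative.

```python
import heapq
import math

ORTHO, DIAG = 0.5, 0.7  # half-cell weights; 0.7 is the rounded diagonal factor used in the text


def step_cost(cost, a, b):
    """Incremental cost of moving between the centres of adjacent cells a and b."""
    weight = DIAG if a[0] != b[0] and a[1] != b[1] else ORTHO
    return weight * cost[a[0]][a[1]] + weight * cost[b[0]][b[1]]


def neighbours(cost, cell):
    """Yield the (up to eight) cells adjacent to cell."""
    r, c = cell
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr or dc) and 0 <= nr < len(cost) and 0 <= nc < len(cost[0]):
                yield (nr, nc)


def accumulate(cost, target):
    """Accumulated cost from every cell to the target (Dijkstra-style spread)."""
    acc = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if dist > acc[cell]:
            continue  # stale queue entry
        for nxt in neighbours(cost, cell):
            new = dist + step_cost(cost, cell, nxt)
            if new < acc.get(nxt, math.inf):
                acc[nxt] = new
                heapq.heappush(heap, (new, nxt))
    return acc


def trace(cost, acc, start, consistent=True):
    """Trace a path from start to the target (the cell with accumulated cost 0)."""
    path = [start]
    while acc[path[-1]] > 0:
        here = path[-1]
        lower = [n for n in neighbours(cost, here) if acc[n] < acc[here]]
        if consistent:
            # McIlhagga's rule: the drop in accumulated cost must equal the step cost
            nxt = next(n for n in lower
                       if math.isclose(acc[here] - acc[n], step_cost(cost, here, n)))
        else:
            # steepest-descent rule: simply take the largest drop
            nxt = max(lower, key=lambda n: acc[here] - acc[n])
        path.append(nxt)
    return path


def path_cost(cost, path):
    """Sum of the step costs along a path."""
    return sum(step_cost(cost, a, b) for a, b in zip(path, path[1:]))


# Hypothetical 3 x 5 cost surface; indices are zero-based (row, column).
cost = [[1, 1, 2, 2, 2],
        [1, 1, 2, 1, 2],
        [1, 1, 3, 2, 2]]
target, start = (1, 1), (2, 4)
acc = accumulate(cost, target)
good = trace(cost, acc, start, consistent=True)
greedy = trace(cost, acc, start, consistent=False)
print(good, path_cost(cost, good), acc[start])
print(greedy, path_cost(cost, greedy))
```

A path traced with the consistency rule always costs exactly the accumulated value at its start cell. A path traced by largest drop can never cost less than that, and costs more wherever a shown drop is smaller than the real step cost, which is the discrepancy described above.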

3.3.7 Collischonn and Pilar

Collischonn and Pilar's concept of least-cost paths is particularly worth mentioning, because it does not

merely trace a path down a cost surface like so many others, and present the path following the

least friction as the least cost path, but links the cost of traversing the slope to the degree and

direction of the slope itself. Thus, forcing a path down a steep slope may in fact be more costly than

a descent that circumvents the steepest directions. This view coincides with conventional planning

procedures for roads and canals, where the topography adjacent to the path plays a major role in

determining the most viable route for the least cost path.

Their algorithm uses a cost-slope function to assign accumulated cost to cells in a 3x3 window around

a centre cell. The steeper the slope is, either uphill or downhill, the higher the cost will be, thus

favouring directions with no or little difference in slope. As a result of this procedure, the least-cost

path up or down a hill is not the straight line following the steepest path, but a path that winds or

climbs the hill sideways (Fig 3-10).





Figure 3-10: Best route of a road up a conical mountain. From: Collischonn and Pilar (2000)

It may look like an artefact inherent in the algorithm, but nevertheless, their procedure produces

hairpin-like bends in the path, much in the same manner that an ordinary road on a hillside might

have.
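
The effect of a slope-dependent cost can be shown with a small Python sketch. The cost function below is purely hypothetical and is not the function published by Collischonn and Pilar; it only shares the stated property that cost rises with the absolute slope, uphill or downhill alike. The names `slope_cost` and `penalty` and the elevation values are assumptions made for illustration.

```python
import math


def slope_cost(elev, a, b, cell_size=1.0, penalty=10.0):
    """Hypothetical cost of moving from cell a to adjacent cell b.

    The cost grows with the absolute slope, uphill or downhill alike.
    """
    run = cell_size * math.hypot(b[0] - a[0], b[1] - a[1])
    slope = abs(elev[b[0]][b[1]] - elev[a[0]][a[1]]) / run
    return run * (1.0 + penalty * slope)


# Uniform hillside rising towards the east: elevation depends on the column only.
elev = [[0, 10, 20],
        [0, 10, 20],
        [0, 10, 20]]
centre = (1, 1)
window = [(r, c) for r in range(3) for c in range(3) if (r, c) != centre]
costs = {cell: slope_cost(elev, centre, cell) for cell in window}
cheapest = min(costs.values())
best = sorted(cell for cell, cost in costs.items() if cost == cheapest)
print(best)
```

Within the 3x3 window, the cheapest moves from the centre cell are the two that follow the contour line, so a path built from such steps tends to traverse the hillside sideways rather than straight up or down.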

3.3.8 Berry

Berry (2000a) uses a concept similar to Tomlin; he calls his algorithm "Splash" rather than "Spread".

Berry's map operators differ from Tomlin's, although they are used for the same purposes. Berry's

classification scheme is based on the user's perspective of map input and output contents; what the

map(s) look like going in and coming out: reclassify, overlay, distance and neighbourhood. The

Splash algorithm belongs to the distance division.

According to Berry, each grid space on a friction map or cost-of-passage map is coded with the

relative cost of traversing that location. Increased impedance is translated into steeper slopes,

so that a slope map of an accumulation surface unmasks the relative ease of optimal travel through

each grid space. The notion of optimal movement embedded in an accumulation surface is

important for understanding the algorithm. The splash algorithm used to build the surface considers

movement from the eight surrounding cells to each location. The accumulated distance and the

relative impedance for each of the eight potential steps are evaluated. The least costly step, in terms

of total movement, is assigned. Therefore an aspect map of an accumulation surface unmasks the

direction of optimal movement through each grid space. Thus, the optimal path from any location to

the origin is identified as the steepest downhill route over the surface. To find the optimal path

between two points, Berry suggests the creation of an accumulation surface for both points, and

adding them together (see Figure 3-11).





Figure 3-11: Adding the proximity surfaces generated by spreading from 2 points yields the optimal path.

From: Berry (2000a).

The optimal path between the two locations (identified by the line in both the 2-D and 3-D views in

figure 3-11) contains the set of locations having the lowest values (a valley connecting the origins).

The saw-toothed appearance of the optimal path is an artefact of arithmetic rounding, the nature of

the splash algorithm and minimal friction outside the barrier in the centre. Values above the valley

floor indicate the length of the best, but sub-optimal paths forced through any location.

In addition to finding the optimal path, the values on the summation surface identify the length of the

best path, where the term length indicates whatever is used as measure for the cost-of-passage,

be it distance, time, fuel consumption or other units of measure.

As an interesting side note, the difference between the lowest value on the summation surface and

the value at any other location identifies the opportunity cost of forcing a route through that location

(Berry, 2000a).
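
The summation of two accumulation surfaces, the resulting valley and the opportunity cost can be sketched in Python as follows. This is an illustration under stated assumptions, not Berry's Splash implementation: the spread is a generic Dijkstra-style accumulation over eight neighbours, the step cost is taken as the mean friction of the two cells times the step length, and the small friction grid with a single barrier cell is hypothetical.

```python
import heapq
import math


def spread(friction, origin):
    """Accumulated cost from origin to every cell, moving between 8-connected cells."""
    rows, cols = len(friction), len(friction[0])
    acc = [[math.inf] * cols for _ in range(rows)]
    acc[origin[0]][origin[1]] = 0.0
    heap = [(0.0, origin)]
    while heap:
        dist, (r, c) = heapq.heappop(heap)
        if dist > acc[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    half = math.sqrt(2) / 2 if dr and dc else 0.5
                    new = dist + half * (friction[r][c] + friction[nr][nc])
                    if new < acc[nr][nc]:
                        acc[nr][nc] = new
                        heapq.heappush(heap, (new, (nr, nc)))
    return acc


B = math.inf  # impassable barrier cell
friction = [[1, 1, 1, 1, 1],
            [1, 1, B, 1, 1],
            [1, 1, 1, 1, 1]]
from_a = spread(friction, (1, 0))
from_b = spread(friction, (1, 4))

# Summation surface: cost of the best path forced through each cell.
total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(from_a, from_b)]
best = min(min(row) for row in total)

# The valley floor holds every cell lying on some optimal path.
valley = {(r, c) for r, row in enumerate(total)
          for c, v in enumerate(row) if v - best < 1e-9}

# Opportunity cost of forcing the route through a given cell.
opportunity = [[v - best for v in row] for row in total]
print(round(best, 3), sorted(valley))
```

In this symmetric example the valley contains cells both above and below the barrier, since several paths share the same lowest value. The same effect reappears later as the multiple-path problem.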

Berry also develops a concept for finding the path between multiple targets, using what he calls a

stepped accumulation surface. First the optimal path is calculated from point one to point two, then

from point two to point three, and further repeating the procedure for any number of points along the

route (Figure 3-12). However, this procedure does not calculate the best order of points along the

route; points are visited in the order they appear.





Figure 3-12: Optimal path between multiple destinations using a stepped surface. From: Berry (2000a)

According to Berry (2000a) the stepped accumulation surface can be used as a directed surface,

calculating the optimal path with the points in the given (directed) order or as an undirected surface, with the points in the best (undirected) order. In practice, this involves calculating all possible

combinations of the stepped surface and evaluating the best combination of these. For the research

presented in this paper, this concept can be applied in calculating the optimal path through

successive time intervals, since it is the best-order combination of the various paths through the time

intervals that will determine the least-cost path.
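
Evaluating all combinations for the undirected case can be sketched in Python as a brute-force search over visiting orders. The pairwise values below are hypothetical. In practice they would be read from the summed accumulation surfaces between each pair of points. The function names are illustrative only.

```python
from itertools import permutations

# Hypothetical least-cost values between pairs of points.
pair_cost = {frozenset("AB"): 4, frozenset("AC"): 2, frozenset("AD"): 9,
             frozenset("BC"): 1, frozenset("BD"): 3, frozenset("CD"): 6}


def route_cost(route):
    """Total cost of visiting the points in the given order."""
    return sum(pair_cost[frozenset(leg)] for leg in zip(route, route[1:]))


def best_order(start, stops, end):
    """Try every order of the intermediate stops and keep the cheapest route."""
    candidates = [(start, *order, end) for order in permutations(stops)]
    return min(candidates, key=route_cost)


directed = ("A", "B", "C", "D")                 # points in the order given
undirected = best_order("A", ["B", "C"], "D")   # points in the best order
print(directed, route_cost(directed))
print(undirected, route_cost(undirected))
```

The number of candidate orders grows factorially with the number of intermediate points, which is why the undirected case quickly becomes computationally demanding.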

An interesting application of the stepped accumulation surface is found in Berry (2000b), where in-

store shopping patterns were analysed, proving that spatial analysis is not necessarily restricted to

outdoor phenomena.

3.3.9 Network analysis or surface analysis?

To sum up, much of the literature on fastest or least cost paths in raster GIS described above has

focused on surfaces rather than networks, and has extensively linked slope and aspect to the

propagation of the path. Consequently, the algorithms and findings presented may seem inapplicable

to road networks, as demonstrated by Collischonn and Pilar (2000).

There are also variations in approaches and methodology. Xu and Lathrop attempt to smooth

Tomlin's octagonal isolines around a spread centre; Douglas is concerned with making the path

follow an interpolated line rather than a zigzag path directed by the grid structure. McIlhagga focuses

on multiple targets and the connecting path between them. Collischonn and Pilar seek to detach the

least-cost path from the traditional path-following-steepest-slope concept, and Berry maintains his

view of surfaces as continuous space.





The transition from surface to network is however not difficult to follow: The difference between a

surface and a network is that in a network, non-linked or non-networked cells constitute a surface

with an infinitely large friction or cost-of-passage, posing an impermeable barrier. Spreading can only

take place along the defined network. The procedures and algorithms used in networks are basically

the same as one would use on continuous surfaces: Finding the optimal path between 2 points is

done by adding the proximity surfaces for these 2 points. For multiple points the calculation

becomes more complex, but follows the same principles.

The generic structure of raster GIS entails an approximation, and hence a distortion, of smoothly curved

real-world network features. Nevertheless, the same structure also visualises and draws attention to

the spatial context that the least-cost path is set in.





4 MFworks

Even though this research was undertaken using a particular software package, namely MFworks, the

process of network analysis would be similar in any software that implements Tomlin's network-

related map algebra operators. The following is meant to give a brief description of the software used

and the process involved, in order to familiarise the reader with the terms and figures that appear in

later chapters.

4.1 About MFworks

MFworks is a pixel-based GIS, developed by Thinkspace Inc., which can process images and maps

both visually and quantitatively. C. Dana Tomlin developed the precursor to MFworks, the Map

Analysis Package, and MFworks makes extensive use of the map algebra developed by Tomlin,

something that is unmistakably recognisable in the menu and scripting commands.

Interestingly enough, a major North American power utility, Hydro Quebec, uses MFworks

exclusively for their least cost/shortest distance analysis, thus proving its feasibility (Thinkspace,

2000; personal communication).

4.2 Network Analysis in MFworks

Network analysis in MFworks follows the pattern outlined in the previous chapter; see Tomlin (3.3.2)

and Berry (3.3.8). Spreading through a cost-of-passage surface from a given point creates an

accumulated-cost-of-passage surface. Two such surfaces are created, one spreading from the departure

point and one from the arrival point, taking into consideration any directional constraints imposed by the

network (see 5.2); these surfaces are then added to each other, resulting in a summed accumulated cost

surface, where the location of the cells with the lowest values indicates the demarcation of the least-cost

path and the cell values state the path's effective cost, as described by Berry (2000a).

In the following, the MFworks internal language term "map layer" will be used interchangeably

with "surface", where map layer relates to a specific object created in and used by MFworks, and

surface relates to the generic objects described in the previous chapter.





The procedure of finding the least-cost path can be divided into two parts. The first part is the

creation of the necessary map layers:

Create Cost Layer: Cost

Create Directional Constraint Layer: Directions

Create Departure Layer: Start

Create Arrival Layer: End

The second part is the process of path finding, here described in terms of the MFworks scripting

language, to illustrate the close resemblance to the terms used in GIS and Cartographic Modelling

by Tomlin (1991).

SpreadFromStart = Spread Start
    In Cost
    Outof Directions

SpreadFromEnd = Spread End
    In Cost
    Outof Directions

ShortestPath = SpreadFromStart + SpreadFromEnd,
    Isolate lowest values

Graphically, this will result in the following screen images, if using a fictitious inner city road network

as an example (Figure 4-1):





Figure 4-1: Finding the optimal path using MFworks, from top left to bottom right: cost surface, starting point, ending point,

directional constraints, resulting path after adding proximity surfaces and isolating lowest values

Even though this seems simple and straightforward, caution must be used, as improper or

oversimplified modelling of the cost factors can lead to multiple shortest paths, as discussed in 6.3.

Another reason for applying meticulous care in modelling is that networks in raster GIS are a mere

approximation of real world networks, and are thus particularly prone to modelling artefacts. A

description of and solution to some particular problems will be given in the following chapters.

A detailed step-by-step description of performing the path-finding process described in this paper is

given in appendices A and B.

4.3 MFcom

A noteworthy extension to MFworks is MFcom. Using an object-oriented approach, the components

of MFworks have been translated into MFcom modules that can be put together into task-specific

applications and tailored user interfaces using the Visual Basic or C++ programming languages.

MFcom is mentioned here because this research was initially aimed at designing an MFcom

network analysis application.





5 Application Development

The development of a raster GIS application for network analysis comprises iterative steps that lead

to a functioning network. Particular care must be shown in modelling the network structure to comply

with the network-related features of Tomlin's (1991) map algebra operations. This will convert map

layers with square cells into linear elements that are linked together as lines with directional flows

assigned to each cell, and map layers containing cost variables. Applying time-dependent cost of

travel involves building an external Visual Basic application that will interact with the map layer

containing the travel cost data, for updating the values used in the least-cost path calculation.

5.1 Data acquisition

The data for this research was acquired from two sources:

by recreating the Brown's Pond Study Area from Geographic Information Systems and Cartographic

Modelling by Tomlin (1991), and by using simulated data to demonstrate and evaluate the concepts

presented in this paper. The first was chosen for reasons of familiarity, since MFworks is based on

Tomlin's map algebra, and thus the author felt obliged to use the same study area in the exploration

of this paper's topic as was used by Tomlin in his original study. The latter is a simple lattice, where it

is easy to apply various settings as to direction and travel costs, and to compare the results.

Figure 5-1: Data used for exploring network analysis in raster GIS: (left) a recreation from Tomlin's GIS and Cartographic

Modelling, (middle) roads layer isolated from left layer, (right) fictitious street network

5.2 Network modelling

The key to producing a successful network model is in understanding the relationship between the

characteristics of physical network systems and the representation of those characteristics by the





elements of the network model. The efficacy and validity of the network depends on how precisely

the network can be modelled to match the real world network it represents (Husdal, 1999).

As mentioned earlier, the graph network, which can be explicitly modelled in vector GIS, can only be

approximated by the raster GIS cell structure. Using Tomlin's Incremental Length, Incremental

Linkage, and Directional Identifiers, which identify underlying linear features, it is possible to model a

road network in raster GIS in much the same manner as in vector GIS. The first step is to extract

linear features from neighbouring raster cells; the second step is to assign directions.

5.2.1 Incremental Linkage: inferring linear features

This operation, as described by Tomlin (1991), infers the lineal characteristic of raster cells, by

equating consecutive locations with a set of straight lines between them (Figure 5-2). Based on its

relations with neighbouring cells that have the same attribute value, each cell is given a linkage value

indicating how it is linked to other cells.

Figure 5-2: Incremental Linkage, cell value infers the linear structure it represents. From: Thinkspace (2000)





By assigning a value to each cell equivalent to the linear feature it represents it is possible to create

a network similar to a road network (Figure 5-3). The smaller the cell resolution, the better the real-world

road network will be approximated by this procedure.

Figure 5-3: Building a road network from Figure 5-2 using Incremental Linkage with cell values representing linear features

5.2.2 Directional Identifier: constraining directions

The second step to creating a road network in raster GIS is to impose constraints on the flow that

can take place from cell to cell. Tomlin was among the first to introduce the idea of inferring the flow

to or from a cell and its eight neighbours. The value assigned to the centre cell in a 3x3 window

would then indicate the directions the flow can take into and/or out of this cell. Initially used in

conjunction with inferring the direction of steepest slope over a surface, and thus drainage, in

networks it can be utilised for explicitly allowing or prohibiting flow in certain directions. Figure 5-4

shows how a cell value of 10 is inferred from flow in directions 8 and 2.

Figure 5-4: Tomlin's directional identifiers, cell values indicate possible flow direction in or out of cell
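
The additive coding of directions can be sketched in Python. The sketch assumes that each of the eight neighbours carries its own power of two (1, 2, 4, ... 128) and that a cell value is the sum of the permitted directions. The text only confirms the single case 8 + 2 = 10, and no compass meaning is attached to the individual codes here. The function names are illustrative.

```python
DIRECTION_CODES = [1 << bit for bit in range(8)]  # assumed codes: 1, 2, 4, ... 128


def encode(directions):
    """Combine a set of direction codes into one cell value."""
    unknown = set(directions) - set(DIRECTION_CODES)
    if unknown:
        raise ValueError(f"not a direction code: {sorted(unknown)}")
    return sum(set(directions))


def decode(value):
    """Split a cell value into the direction codes it is made of."""
    return [code for code in DIRECTION_CODES if value & code]


def allows(value, direction):
    """True if flow in the given direction is permitted by the cell value."""
    return bool(value & direction)


print(encode([8, 2]))   # combined identifier for directions 8 and 2
print(decode(10))       # the directions contained in identifier 10
```

Because every direction occupies its own bit, a one-way restriction amounts to removing a single code from the cell value while leaving the others untouched.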

The directional identifiers that are to be assigned to any given cell in a road network can be directly

inferred from the Incremental Linkage values, e.g. Incremental Linkage value 28 yields directional

constraints value 10, and so on. The transition from Incremental Linkage to Directions is done

through a straightforward reassigning of the cell values in the Incremental Linkage map layer to

corresponding values in the Directions map layer. More specific constraints, like one-way directions





or dead-end roads, which are not directly inferable from the mentioned linkage values, will have to be

assigned manually.

Figure 5-5: Inferring flow directions from Incremental Linkage values in Figure 5-3

Although this procedure has now created a fully functional network, one of the limitations of network

modelling in raster GIS comes, literally speaking, to the surface. A raster can only represent a

surface, whereas a road network only in very few cases can be viewed as a planar surface.

5.3 Cost surface modelling

5.3.1 General considerations

Usually, to generate a cost-of-passage surface, several variables will have to be collapsed into one layer.

These variables might be road class, average speed, traffic density, congestion during a specific time of day, or other factors that contribute to the overall cost variable. The cost-of-passage surface

can be defined by a variety of measurement units: time, fuel consumption, money or other possible

cost units, for which the least cost passage is to be determined.

In a study in Portugal to estimate tourism potential for a certain area, da Costa (1996) used a 100m

grid size, where the cost-of-passage was inferred by deriving the time for traversing each cell from

the average speed at that particular location. Using average speed and time as a means of inferring

cost-of-passage is among the most common approaches in network analysis, since it is easy to use

and calculate. However, as pointed out earlier, "least cost" does not always need to be "least time"; it

may just as well be least fuel, least length, or any least cost variable that can be implemented in a

cost-of-passage surface.
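
Deriving a time-based cost-of-passage from average speed is a simple conversion, sketched here in Python. The function name is hypothetical, and the diagonal option (a cell crossed corner to corner) is an assumption added for illustration rather than part of da Costa's study.

```python
def traversal_seconds(cell_size_m, speed_kmh, diagonal=False):
    """Time in seconds needed to cross one cell at a given average speed."""
    length_m = cell_size_m * (2 ** 0.5 if diagonal else 1.0)
    return length_m / (speed_kmh / 3.6)  # km/h divided by 3.6 gives m/s


for speed in (30, 60, 90):
    print(speed, "km/h:", round(traversal_seconds(100, speed), 2), "s per 100 m cell")
```

Any other cost unit can be substituted in the same way, provided a per-metre rate is known for each cell.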





5.3.2 Determining travel cost and path length

To determine the actual length of a path through a number of cells the Incremental Length operation

is used. Incremental Length works similarly to Incremental Linkage, to the extent that Incremental

Linkage is used implicitly to determine the linkage, from which the length is inferred. Incremental Length then applies the factor by which the cell resolution has to be multiplied to yield the length of

the linear features in any cell.

Figure 5-6: Tomlin's Incremental Length, cell values indicate factor for calculating length of linear features.

From Thinkspace (2000)

It should be noted that Incremental Length calculates the factor for inferring the total length of all linear features in any cell. When uncritically applied to deriving the time of traversing a cell, this

function may not yield exact results, since the time of passing straight through a cell may differ from

the calculated value, which takes into account all linear features in that cell, even those that are not

traversed. The smaller the cell resolution and the higher the average speed, the more negligible this

error becomes.





Grid size (m)    Incremental Length (m)    Path Length (m)    Difference at 60 km/h (secs)    Difference at 30 km/h (secs)
10               126                       94                 0.64                            1.28
20               252                       188                1.28                            2.56
50               630                       470                3.2                             6.4
100              1260                      940                6.4                             12.8

Figure 5-7: Assessing discrepancy between actual path length and length inferred from Incremental Linkage

Figure 5-7 demonstrates an example of this "error". The bold line is the path sought; the shaded cells

indicate where the Incremental Length factor deviates from the actual length. The table shows how

cell resolution and average speed influence the magnitude of the error.

To determine the length of a particular road stretch, this stretch first has to be separated, then made

subject to the Incremental Length operator to find the correct length. Whether this last procedure is

absolutely necessary will depend on the cell resolution and whether deriving the actual path length is

strictly required as a result of the task in question.

A special case, illustrating the discrepancy between inferred and actual path length, as described

previously, appears at crossroads and junctions and deserves particular attention, especially at non-

90-degrees junctions, as shown in the following example:

Figure 5-8: Sought path through a junction

The actual path sought is a straight line through the junction, indicated by the shaded cells. The bold

lines indicate the actual linear features. The actual path length is cell resolution x 5.6 (1.4 x 4).





Figure 5-9: Inferred path, as result of Incremental Length/Incremental Linkage

In order to justify the inferable directions, the Incremental Length operator will link too many cells

together, as shown by the shaded cells and bold line. The inferred path is equal to cell resolution x

8.2 (1.7 x 2 + 1.4 x 2 + 1 x 2), thus overestimating the actual path over this section by 2.6 x cell

resolution. Unless there is substantial difference in cost-of-passage between the two roads that

cross each other, which then prohibits the off-path cells from being included in the added proximity

surfaces, this error will occur in most cases with such junctions.

Figure 5-10: Correctly modelled junction, linkage indicated by bold lines, direction indicated by arrows

A solution to avoid this from the beginning, before finding the least-cost path, would be to model the

linkage correctly, and allow the direction to flow differently. The fact that it is possible to completely

detach the direction of the flow in a network from the underlying linear feature is indeed an

interesting observation concerning the modelling of networks in raster GIS.

The above can also be demonstrated by an example from the study area. Note the linked square in

the left image, as opposed to the straight line in the right image:





Figure 5-11: (a) Least-cost path using generically inferred directions, (b) Location of a and c in study area, (c) Least-cost path

using manually inferred directions

A detailed step-by-step example of finding the least-cost path using MFworks is given in appendix A.

5.4 Implementing a dynamic cost surface

For implementing a dynamic cost surface, two approaches can be identified: One option is to

use a continuously updated network, where the latest available data at the starting time of travel is

used, and where the cost of travel does not change for the duration of the estimated route.

Another option is to establish a network with varying travel cost per pre-defined time interval. In this

case the travel cost is dependent on the starting time of travel, and the travel cost changes when the

estimated route passes from one time interval to another.

In practice, the first approach involves building an application that passes on values to the cost

surface used in a raster-based GIS for calculating the fastest path in a network. The second

approach involves working with multiple cost surfaces for estimating the fastest path. Both methods

will be considered in the course of this research.

5.4.1 External application for updating cost surface

MFworks offers the possibility to export map layers as a text file in tab-delimited format. Thus, it is

possible to export the cost surface from MFworks as a tab-delimited text file, which is then edited in

Visual Basic or other programming languages, such as Java, and then re-imported into MFworks, for

use in calculating the least-cost path.

The tab-delimited format was chosen, since it is a common format, particularly for exchanging data

to and from databases, where the key issue is to be independent of any proprietary data format

inherent in the database used (e.g. MS Access, Oracle or FoxPro). Manipulating tab-delimited files in Visual Basic is then a rather trivial task. A further reason for choosing Visual Basic was the





intent to integrate this application into an MFcom environment. Finally, and most importantly, Visual

Basic allows for rapid application development, minimising the time necessary to create external

applications for interacting with other applications, particularly database applications.

The Visual Basic application is a standalone application that can run independently of MFworks. The

application interface offers manual updates or continuous updates, where the user selects the time

interval between updates. In calculating the accumulated cost of passage, MFworks will then

automatically use the latest updated cost surface. To simulate real-time information the cost surface

is updated with random values.

Figure 5-12: Typical user interface for Visual Basic application for external updating of cost surface

In practical terms, the application reads the tab-delimited xyz-file (with columns for the x-coordinate,

the y-coordinate and the z-value, i.e. the attribute value, respectively), and exchanges the value in the

third column (the z-value) for a random value.
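
The update step can be sketched as follows. The original application was written in Visual Basic; this Python version is a stand-in for illustration only. It assumes an export without header line and with exactly three tab-separated columns. The function name and the value range are hypothetical, and an in-memory string replaces the exported file.

```python
import csv
import io
import random


def randomise_costs(src, dst, low=1.0, high=10.0, rng=random):
    """Copy tab-delimited x, y, z rows from src to dst, replacing z by a random value."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        if len(row) < 3:
            writer.writerow(row)  # pass through anything that is not an x, y, z record
            continue
        x, y = row[0], row[1]
        writer.writerow([x, y, f"{rng.uniform(low, high):.2f}"])


exported = io.StringIO("0\t0\t5\n1\t0\t7\n0\t1\t5\n")  # stand-in for the exported layer
updated = io.StringIO()
randomise_costs(exported, updated, rng=random.Random(42))
print(updated.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by the opened export file and the file to be re-imported. Coordinates are copied unchanged so that the re-imported layer keeps its geometry.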

A detailed explanation of how to use the Visual Basic application is given in appendix C.

As a digression, it is worth mentioning that in the process of building this application, minor problems

in the import function of MFworks were discovered, errors that did not flaw the results, but caused

some annoyance and required additional user-interaction that could have been saved. The errors

were duly reported to and acknowledged by Thinkspace's technical support.





5.4.2 Continuously updated cost surface

To simulate real-time information the cost surface is updated with random values via a standalone

Visual Basic application. The cost surface is first exported from MFworks as a tab-delimited text file,

which is then edited in Visual Basic, and re-imported into MFworks.

There are 2 possible approaches to implementing time-dependent travel: In the conventional and

typical raster oriented approach, each variable that contributes to the cost surface is contained in a

separate layer, where the final cost-of-passage variable for each location is calculated as a result of

algebraic operations on the various layers at each location. In a different approach, a layer is

constructed, where each cell is given a unique identifier; this identifier is used as key to a database

that holds the final cost-of-passage variable for each location. In this case the calculation of the final

variable takes place within the database.

Whenever the least-cost path is calculated in MFworks, the latest updated cost values are used,

whether they are derived from multiple layers or a database. It should be noted that this calculates

the least cost path as it can be discerned for the time of starting the travel. In both of the approaches

cited it is assumed that the cost-of-passage for each cell remains unaltered during the entire

passage, regardless of the time it takes to traverse the path.

Using multiple cost variables, the surfaces for each variable can be updated independently of each

other, before they are collapsed into one value by map algebra operations. This method would imply

external, not necessarily Visual Basic, applications for each of those variables, and separate

computation procedures.

Another option is to assemble an external database, which holds all variables, and then transfers

one calculated value only to a collapsed cost-of-passage surface. Given the inherent database

connectivity possibilities in Visual Basic, this option is preferable to multiple cost surfaces, although this

ultimately depends on end-user preferences.
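
The database-keyed approach can be sketched in Python using an in-memory SQLite database. The table layout, the column names and the cost formula (traversal time multiplied by a congestion factor) are assumptions made for this illustration and are not taken from the application described here.

```python
import sqlite3


def cost_surface_from_db(id_layer, conn):
    """Build a cost-of-passage layer by looking up each cell identifier in the database."""
    query = "SELECT cell_id, length_m / (speed_kmh / 3.6) * congestion FROM cells"
    cost = dict(conn.execute(query).fetchall())
    # Identifiers without a database record yield None (cell outside the network).
    return [[cost.get(cell_id) for cell_id in row] for row in id_layer]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cells (cell_id INTEGER PRIMARY KEY, "
             "length_m REAL, speed_kmh REAL, congestion REAL)")
conn.executemany("INSERT INTO cells VALUES (?, ?, ?, ?)",
                 [(1, 100.0, 60.0, 1.0), (2, 100.0, 30.0, 1.5)])

id_layer = [[1, 2],
            [0, 1]]  # 0 marks a cell outside the network
surface = cost_surface_from_db(id_layer, conn)
print(surface)
```

All cost variables stay in the database, and only the single calculated value per cell is transferred to the collapsed cost-of-passage surface.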





5.4.3 Time interval cost surface

This approach uses an average cost-of-passage surface for each time interval. If one-hour time

intervals were used, one would need 24 cost surfaces to cover a 24-hour period.

Using multiple cost surfaces for traversing multiple time intervals is analogous to the stepped cost

surface described by Berry (2000a). Applied within the context of time intervals, the first proximity

surface is cut off when the time interval border is traversed. Then, from all endpoints, new proximity

surfaces and shortest paths are calculated, until the destination is reached. Computationally this can

turn into a considerable task, depending on the number of time intervals traversed and the number of

paths generated in each time interval: If time interval A yields 4 possible paths from a starting point,

then for time interval B, the four paths must be continued until the destination or a new time interval

is reached. When the destination has been reached the values of all paths must be added and

compared. In practice, one must calculate proximity surfaces from a starting point to cut-off points

where a new time interval begins, and from cut-off points to starting point; adding the surfaces yields

the accumulated cost-of-passage (time) value for each path. Then, from each cut-off point the

procedure has to be repeated until destination or a new time interval is reached. Finally the

accumulated cost-of-passage value for each successively linked path must be calculated and

compared to find the path with the lowest accumulated value.

In detail, the steps would be as follows:

1. Input

Network, with 3 possible ways from starting point to destination point.

Step 1

Spread from starting point, cut-off when accumulated cell values indicate that a time interval

borderline is being passed. Then, spread backwards from the cut-off points to starting point. Adding

the proximity surfaces yields the cost-of-passage for this time interval for the various paths.





Step 2

For all cut-off points, spread until next time interval or (in this case) until the destination has been

reached. Spread backwards from destination and add proximity surfaces to yield the cost-of-passage

values for all possible paths. Retain the paths from the cut-off points to destination that have the

lowest cost-of-passage.

Discard other paths

Step 4

Add the cost-of-passage for the partial paths to yield the final cost-of-passage value for the remaining paths. Retain the path with the lowest value.

A detailed case study of this approach is given in appendix B.
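
The idea of switching cost surfaces at time interval borders can be sketched compactly in Python. The sketch deliberately replaces the cut-off and re-spread procedure above with a simpler technique, a time-dependent Dijkstra spread, in which each step takes its cost from the surface that is valid at the moment the step begins. It assumes that cell values are traversal times in the same unit as the interval length, that a step's cost does not change once the step has begun, and that leaving a cell later never leads to an earlier arrival. All names and values are hypothetical.

```python
import heapq
import math


def spread_over_time(surfaces, interval, origin, start_time=0.0):
    """Earliest arrival time at every cell, switching surface per time interval."""
    rows, cols = len(surfaces[0]), len(surfaces[0][0])
    arrival = {origin: start_time}
    heap = [(start_time, origin)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > arrival[(r, c)]:
            continue  # stale queue entry
        # Surface valid when this step begins; the last surface is reused afterwards.
        cost = surfaces[min(int(t // interval), len(surfaces) - 1)]
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    half = math.sqrt(2) / 2 if dr and dc else 0.5
                    nt = t + half * (cost[r][c] + cost[nr][nc])
                    if nt < arrival.get((nr, nc), math.inf):
                        arrival[(nr, nc)] = nt
                        heapq.heappush(heap, (nt, (nr, nc)))
    return arrival


# Two hypothetical surfaces for a one-row network: free flow, then congestion.
surfaces = [[[10, 10, 10]],
            [[20, 20, 20]]]
early = spread_over_time(surfaces, interval=10, origin=(0, 0), start_time=0.0)
late = spread_over_time(surfaces, interval=10, origin=(0, 0), start_time=10.0)
print(early[(0, 2)] - 0.0, late[(0, 2)] - 10.0)
```

Starting in the first interval, the second step already falls into the congested interval, so the travel time is longer than the free-flow surface alone would suggest. Starting one interval later, both steps are congested.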





5.5 Recapitulating the process

Enabling network analysis in raster GIS means a) employing Tomlin's Incremental Linkage to infer

linear features, and b) using directional constraints to explicitly divert the flow in the network.

Directions do not need to follow the underlying linear features. Directions are needed to delineate the

course of the path(s); linear features are needed to calculate the path length, using Tomlin's

Incremental Length.

Implementing dynamic cost variables can be done by a) updating either one cost surface or multiple

surfaces representing various factors that contribute to the cost-of-passage, or b) by using time-dependent

cost surfaces, depending on the time interval the travel takes place in. Time intervals and

paths through these can be calculated consecutively. The more time intervals to be traversed, the

more calculations are to be processed.

Updating dynamic cost surfaces is done via an external application, in this case using Visual Basic,

and demands that the GIS software used can interact with external applications by exporting the cost

surface to the application in a format that the application can process.





6 Discussion and evaluation of results

Even when meticulous care has been taken in modelling the network to match the concepts

described previously, it may still not produce the desired results when tested thoroughly. There may

be many reasons for this. Firstly, it may be associated with the software and data involved; it may

also be a result of the inevitable approximation when a finite network is to be modelled in a surface-

favouring environment. One particular problem is the so-called multiple-path-problem, to which a

solution is given below. Finally, it is also time to reflect upon the objectives and whether they were

achieved as set out or not.

6.1 Choice of software

The most striking reason for choosing MFworks was the fact that MFworks is the only software

implementing Tomlin's Incremental Linkage and Directional constraints, which are a precondition to

conducting network analysis. Taking into account the underlying modelling assumptions, MFworks

lived up to the expectations the author had prior to his research.

It deserves mentioning that similar grid-based analysis operations to the operations in MFworks,

apart from Incremental Linkage and Directional constraints, exist in Idrisi, ARC/INFO GRID, ILWIS,

the various MAP (Map Analysis Package) editions (MAP, aMAP, pMAP) and in MapCalc, to name

what are probably the most renowned competing software packages. However, there are significant

programming structure and command syntax differences. Some use a "natural language" command

language of simple phrases forming fully structured, short sentences. Others use a computer

programming structure, similar to C++, where each command is written as an "assignment-operator"

equation with the command and processing specifications forming an order dependent,

alphanumeric string.

Since MFworks uses an easy-to-understand natural language, close to Tomlin's map algebra

operators, this constitutes the second reason for choosing MFworks. Whether the ordinary end-user

shares this opinion with the author will depend on his or her familiarity with Tomlin's map algebra

operators. With respect to these operators, MFworks offers the option to write scripts, thus assigning

operations in the order of the user's preference, allowing a versatile set of operations to interact with

each other, something that proved especially useful in calculating paths through time intervals and

solving the multiple path problem (see 6.4).





6.2 Choice of data

The data chosen for this undertaking was a small dataset, in the case of the Brown's Pond area covering

only 13 km², and in the case of the fictitious data merely 0.25 km². In order to evaluate the

networking capacities properly, in hindsight it seems more appropriate to have chosen a larger area,

and also to have incorporated real-world data.

The same applies to implementing travel cost. Here, purely fictitious data were randomly located.

However, since the main task was to prove the feasibility of network analysis in GIS, the data used

posed a satisfactory challenge, as it highlighted the necessary procedures for modelling the network

and for solving the problems that occurred.

6.3 General accuracy

Using raster GIS for network analysis leads to a simplification of a complex network structure. The

path is prone to be distorted, firstly, due to the length error introduced by Incremental Length,

and secondly, due to the zigzagged path, a consequence of the innate grid structure. Nonetheless, a

fine-tuned use of Incremental Linkage and a fine cell resolution can have a smoothing effect on

the exact delineation of the path. On the other hand, reducing the cell size increases

computation.

In raster GIS the precision of the model is determined by the cell resolution: the finer the resolution

(the smaller the cell size), the better the precision. For this research a cell resolution of 20 m was

deemed appropriate for the task in question, as it would allow encompassing normal road width and

adjacent areas within one cell width. For a dense inner city road network a cell resolution of 10 m

would be preferable, allowing even narrow blocks to be incorporated into the model.

6.4 Spread artefacts

The concept of adding proximity surfaces will sometimes lead to peculiar artefacts. A special case

appears in a regular lattice if the cost surface is isotropic: here the multiple path problem is clearly

visible, since all possible paths per se have the same accumulated cost towards the destination.

Here, the human mind, as opposed to a computer, would intuitively seek out a solution that

implements a heuristic search, always following the path that yields the shortest Euclidean distance to

the destination.
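The special case can be reproduced outside the GIS with a minimal Python sketch. It is not the MFworks Spread operation: it assumes an open, homogeneous 5 x 5 surface rather than a road lattice, an eight-neighbour spread, and a step cost equal to the mean of the two cell costs times the step length (1 or the square root of 2). The functions `spread` and `optimal_cells` are hypothetical names.

```python
import heapq
import math


def spread(cost, source):
    """Accumulated cost from `source` to every cell (8-neighbour Dijkstra)."""
    rows, cols = len(cost), len(cost[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    # Assumed step cost: mean cell cost times step length.
                    step = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2
                    if d + step < dist[nr][nc]:
                        dist[nr][nc] = d + step
                        heapq.heappush(heap, (d + step, (nr, nc)))
    return dist


def optimal_cells(cost, start, dest, tol=1e-9):
    """Add the two proximity surfaces and select the lowest-valued cells."""
    from_start = spread(cost, start)
    from_dest = spread(cost, dest)
    best = from_start[dest[0]][dest[1]]
    return {
        (r, c)
        for r in range(len(cost))
        for c in range(len(cost[0]))
        if from_start[r][c] + from_dest[r][c] <= best + tol
    }


homogeneous = [[1.0] * 5 for _ in range(5)]
cells = optimal_cells(homogeneous, start=(0, 0), dest=(2, 4))
print(len(cells))  # 9 cells share the minimum, although one path needs only 5
```

The selected cells form a parallelogram between start and destination: every route through it has the same accumulated cost, so adding the proximity surfaces alone cannot single out one path.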

More commonly, the so-called multiple path problem occurs when the road network and the cost

surface are oversimplified. There are no distinctive low values, so that in order to gain a coherent





route from start to destination so many low valued cells have to be selected that no clear path can be

discerned (Figure 6-1).

Figure 6-1 a (left) and b (right): Typical "multiple optimal path" artefact caused by oversimplification of network model and cost surface.

(a) is a special case: all cells have the same cost value (homogeneous cost surface), thus there is no distinguishable least-cost path.

A solution to the first problem is to pass an algorithm over the multiple path image, which will retain

the cells that are successively closest in Euclidean distance to the destination point, similar to the

renowned A* algorithm.

In practice, this means first, creating a layer with cell values equivalent to the Euclidean distance from

cell to destination, and second, tracing the least cost path down this surface. Spreading from the

destination point in a homogeneous surface with cost value 1, and then masking out the cells that

are roads, achieves the first step of creating a Euclidean distance cost surface. The second step

involves building proximity surfaces using the Euclidean distance surface as cost surface, and then

adding these surfaces to yield the least-cost path. As before, several low values may have to be

selected to yield a coherent path (Figure 6-2). The result is not convincingly perfect for the special

case, but a definite trend can be discerned.

Figure 6-2: Solving the multiple path problem in Fig 6-1 a and b, using Euclidean distance as cost surface
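The idea of retaining the cells that are successively closest to the destination can be sketched in Python as a simple greedy trace. This is a simplification and not the MFworks procedure, which builds and adds proximity surfaces: here the selected multiple path cells are given as a set of (row, column) pairs, and the trace steps to the unvisited selected neighbour with the smallest straight-line distance to the destination. The function name and the sample cells are hypothetical.

```python
import math


def trace_by_euclidean_distance(selected, start, dest):
    """Greedy trace through `selected` cells towards `dest`."""
    path = [start]
    visited = {start}
    while path[-1] != dest:
        r, c = path[-1]
        candidates = [
            (r + dr, c + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr or dc)
            and (r + dr, c + dc) in selected
            and (r + dr, c + dc) not in visited
        ]
        if not candidates:
            raise ValueError("no selected cell leads on to the destination")
        # Keep the neighbour closest to the destination in Euclidean terms.
        nxt = min(
            candidates,
            key=lambda cell: math.hypot(cell[0] - dest[0], cell[1] - dest[1]),
        )
        path.append(nxt)
        visited.add(nxt)
    return path


# A band of equally good cells between (0, 0) and (2, 4), illustrative only.
selected = {(a, a + b) for a in range(3) for b in range(3)}
print(trace_by_euclidean_distance(selected, (0, 0), (2, 4)))
```

For this band the trace returns the five cells (0, 0), (1, 1), (2, 2), (2, 3), (2, 4). Being greedy, such a trace can run into a dead end on irregular selections, in which case it raises an error rather than backtracking.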

The reason for the imperfection of the left path lies in the generic structure of raster GIS.

When building the Euclidean cost surface by spreading from the destination point, only those cells





that are directly in line with the eight directions will have the true distance; all other cells will have an

approximated distance. A further step towards delineating the optimal path would be to create a cost

surface with true Euclidean distance.

For the right path an optimal solution is clearly determinable, even when not using true Euclidean

distance as cost surface.
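The size of this approximation can be estimated with a short Python sketch. It assumes that spreading over a homogeneous surface with cost value 1 charges 1 for an orthogonal step and the square root of 2 for a diagonal step, so that the spread distance equals the so-called octile distance; whether MFworks uses exactly these increments is an assumption here.

```python
import math


def octile_distance(dx, dy):
    """Distance accumulated by an 8-direction spread over a unit-cost surface."""
    dx, dy = abs(dx), abs(dy)
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)


def relative_error(dx, dy):
    """Overestimate of the spread distance relative to the true distance."""
    true_distance = math.hypot(dx, dy)
    return (octile_distance(dx, dy) - true_distance) / true_distance


for offset in [(3, 0), (3, 3), (1, 2)]:
    print(offset, round(relative_error(*offset), 4))
```

Offsets along a row, column or diagonal, such as (3, 0) and (3, 3), show essentially no error, while an offset such as (1, 2) is overestimated by roughly 8 per cent under these assumptions.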

To build a cost surface with true Euclidean distance, the following steps would have to be taken:

1. Export the multiple path layers as an XYZ file and edit it in Visual Basic so that two files are created, XYX

and XYY, containing the X-coordinates and Y-coordinates of the cells respectively.

Re-import both files into MFworks.

2. Create two layers with same extent and resolution as the multiple path layers, where all cell

values are set to the X-coordinate and Y-coordinate respectively for the destination point.

3. Use Pythagoras to create a fifth layer, where the cell values are the true Euclidean distance

between the cell and the destination point.

Alternatively, instead of creating two layers with X and Y for the destination point, the

coordinate values can be extracted as single values and used in a mathematical operation

on the XYX and XYY layers directly.
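Step 3, in its alternative form with the destination coordinates as single values, can be sketched in Python as follows. The layers are represented as nested lists of cell-centre coordinates; the function name and the sample coordinates (a 20 m grid) are hypothetical.

```python
import math


def euclidean_distance_layer(x_layer, y_layer, dest_x, dest_y):
    """Apply Pythagoras cell by cell to the X and Y coordinate layers."""
    return [
        [math.hypot(x - dest_x, y - dest_y) for x, y in zip(x_row, y_row)]
        for x_row, y_row in zip(x_layer, y_layer)
    ]


# Cell-centre coordinates of a 2 x 3 grid with 20 m cells (illustrative).
x_layer = [[10, 30, 50], [10, 30, 50]]
y_layer = [[30, 30, 30], [10, 10, 10]]
distance = euclidean_distance_layer(x_layer, y_layer, dest_x=10, dest_y=10)
print(distance)
```

Unlike the surface obtained by spreading, every cell in this layer carries the true straight-line distance to the destination, regardless of its direction.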

Another method, setting Euclidean distance aside, is to use the multiple path surface as input cost

surface, and trace the least cost path down this surface. This, of course, is only applicable to the

multiple paths in Fig 6-1b.

In terms of algorithm, tracing the least-cost path may then be described as follows:

1. Follow the selected lowest values delineating the optimal path (fanning in many directions)

from start to destination.

2. When a junction appears, spread out in a breadth-first search;

   for all paths p_1 to p_n do the following:

      compare cell n+1 of p_1 with cell n+1 of p_2, and so on up to cell n+1 of p_n,

      visiting only cells that have not been visited before;

      if all cell values are equal, repeat this step until one of the following occurs:

         if cell n+1 of p_n indicates a non-selected cell, terminate this path;

         if the cell values are unequal, keep the path with the lowest value and

         terminate the other paths;

   continue on this path until the next junction and repeat step 2.

3. Continue until destination.
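The steps above can be sketched in Python under a few simplifying assumptions that are not part of the original description: the selected cells are given as a dictionary mapping (row, column) to the multiple path surface value, non-selected cells are simply absent, each competing branch is extended to its lowest-valued unvisited neighbour, and a branch that reaches the destination wins immediately. All function names are hypothetical.

```python
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def open_neighbours(cell, cost, blocked):
    """Selected neighbours of `cell` that have not been visited yet."""
    r, c = cell
    return [
        (r + dr, c + dc)
        for dr, dc in NEIGHBOURS
        if (r + dr, c + dc) in cost and (r + dr, c + dc) not in blocked
    ]


def choose_branch(branches, cost, visited, dest):
    """Advance competing branches in lockstep until one of them wins."""
    while len(branches) > 1:
        for branch in branches:
            if branch[-1] == dest:
                return branch
        lowest = min(cost[branch[-1]] for branch in branches)
        branches = [b for b in branches if cost[b[-1]] == lowest]
        if len(branches) == 1:
            break  # unequal values: the lowest-valued branch is kept
        extended = []
        for branch in branches:
            ahead = open_neighbours(branch[-1], cost, visited | set(branch))
            if ahead:  # a branch with no selected cell ahead is terminated
                extended.append(branch + [min(ahead, key=lambda n: cost[n])])
        if not extended:
            raise ValueError("all branches are dead ends")
        branches = extended
    return branches[0]


def trace_path(cost, start, dest):
    """Follow the selected cells from start to destination."""
    path, visited = [start], {start}
    while path[-1] != dest:
        candidates = open_neighbours(path[-1], cost, visited)
        if not candidates:
            raise ValueError("path ends before reaching the destination")
        branch = choose_branch([[n] for n in candidates], cost, visited, dest)
        path.extend(branch)
        visited.update(branch)
    return path


# Two branches around an obstacle; the lower one contains a cheaper cell.
cost = {(2, 0): 3, (2, 6): 3,
        (1, 1): 3, (0, 2): 3, (0, 3): 3, (0, 4): 3, (1, 5): 3,
        (3, 1): 3, (4, 2): 3, (4, 3): 2, (4, 4): 3, (3, 5): 3}
print(trace_path(cost, (2, 0), (2, 6)))
```

In this illustration both branches tie for two cells; the comparison at the third cell (value 3 against value 2) keeps the lower branch, which is then followed to the destination.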

The result of this algorithm applied to the path in Figure 6-1b is shown below:

Figure 6-3: Solving the multiple path problem in Fig 6-1(b) using the multiple path surface as cost surface

6.5 Continuously updated cost surface

In general, this approach may only be valid if there are no expected abrupt fluctuations in cost-of-

passage during the passage or if the length of passage is short enough not to be influenced by any

fluctuations that could have occurred during the passage.

The method of using multiple surfaces for various cost-inducing variables would imply external Visual

Basic or other applications for each of those variables. This is computationally viable, albeit not

desirable. Multiple calculations and operations will have to interact at the same time, possibly

causing errors.

Another option, an external database, holding all variables, which then transfers one calculated value

only to a collapsed cost-of-passage surface, is computationally easier to implement. Bearing in mind

the database-linking capabilities of Visual Basic, this option is preferable to the previous one using

multiple cost variable surfaces.





6.6 Time interval cost surface(s)

As was demonstrated earlier, using time intervals can lead to extensive computation. The more

complex the network, and the narrower the time intervals, the more iterative steps have to be

computed. Given the fact that the network surface in raster GIS is a mere approximation rather than

an exact model of a real-world network, it is questionable whether using time intervals will necessarily

lead to better results.

Further to this, average speed does not change abruptly and instantly at the borderline of two time

intervals. It is a smooth transition, and this must be reflected in calculating the overall

cost-of-passage. Consequently, using time intervals demands a weighting of the successive time

intervals to gain the appropriate time and speed for the overall route.
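One possible reading of such a weighting, offered here only as an illustrative Python sketch and not as the method used in this research, is to treat the speed given for each time interval as valid at the midpoint of that interval and to interpolate linearly between successive midpoints. The function name, the interval layout (start, end, speed) and the sample values are assumptions.

```python
def smoothed_speed(t, intervals):
    """Speed at time t, interpolated between interval midpoints.

    intervals: contiguous, time-ordered list of (start, end, average_speed)
    """
    midpoints = [((start + end) / 2, speed) for start, end, speed in intervals]
    if t <= midpoints[0][0]:
        return midpoints[0][1]
    if t >= midpoints[-1][0]:
        return midpoints[-1][1]
    for (t0, v0), (t1, v1) in zip(midpoints, midpoints[1:]):
        if t0 <= t <= t1:
            weight = (t - t0) / (t1 - t0)  # share of the later interval
            return (1 - weight) * v0 + weight * v1
    return midpoints[-1][1]


# Two one-hour intervals (minutes) with average speeds in km/h, illustrative.
intervals = [(0, 60, 50.0), (60, 120, 30.0)]
print(smoothed_speed(60, intervals))
```

At the borderline between the two intervals the sketch yields a speed halfway between the two averages instead of an abrupt jump from one to the other.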

With reference to the above, here it seems appropriate to mention Horn (1999), who propounds an

algorithm that calculates an approximation of shortest path travel time, independent of the particular

navigation between nodes. Although his algorithm leans on graph theory and vector GIS, an

implementation in a raster environment, being an approximation by its own nature, could prove a

fruitful ground for further research.

6.7 External Visual Basic application

The Visual Basic application developed along with this research was designed with a particular task

in mind, namely manipulating the attribute values of maps. The user-interface of this application is of

course a first and simple version. As such, it performs the task it is set to do. Nonetheless, the Visual

Basic application developed during the course of this research proved to be versatile and thus

extendable to more than one function only, as it was also used in conjunction with deriving the

Euclidean distance surfaces.

6.8 Meeting the objectives

The first objective was to prove the feasibility of network analysis in raster GIS, using Tomlin's map

algebra operators. As this study has shown, and bearing in mind the limitation
