in cooperation with:

GIScience Group

Institute of Geography

Faculty of Chemistry and Earth Sciences

Methods of Geoinformation Science

Institute of Geodesy and Geoinformation Science

Faculty VI Planning Building Environment

MASTER’S THESIS

Deriving incline for street networks from voluntarily collected GPS traces

Submitted by: Steffen John

Matriculation number: 343372

Email: [email protected]

Supervisors: Prof. Dr.-Ing. Marc-O. Löwner (TU Berlin)

Dr.-Ing. Stefan Hahmann (Universität Heidelberg)

Submission date: 24.07.2015


Declaration of Authorship

I, Steffen John, declare that this thesis titled ‘Deriving incline for street networks from voluntarily collected GPS traces’ and the work presented in it are my own. I confirm that:

- This work was done wholly or mainly while in candidature for a research degree at this University.
- Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.
- Where I have consulted the published work of others, this is always clearly attributed.
- Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.
- I have acknowledged all main sources of help.
- Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signed:

Date:


Abstract

Knowledge of incline is useful for many navigation use cases, for example for electrically powered vehicles, cyclists or mobility-restricted people (e.g. wheelchair users). Digital elevation models (DEMs), such as DEMs obtained from laser scanning or the SRTM DEM, are either too expensive, not globally available or not accurate enough. Therefore, GPS traces voluntarily collected by users of the OpenStreetMap project have been used to derive the incline of a street network. Due to the high relative accuracy of the GPS traces, the incline could be computed with an accuracy that is reasonable for many use cases. The comparison with the SRTM DEM has shown that the inclines calculated from GPS traces perform slightly better, with a standard deviation of σGPS = 1.6 % (σSRTM = 3.1 %), considering streets with at least 5 GPS traces. In contrast to SRTM, which has full coverage, the incline could only be derived for 18 % of the street network (> 5 traces).

Kurzfassung (German abstract, translated)

Incline information adds value to many routing applications, for example routing for electrically powered vehicles, cyclists or people with mobility impairments (e.g. wheelchair users). Digital terrain models (DTMs), such as DTMs created by laser scanning or the SRTM-1 DTM, are either too expensive, not available area-wide or insufficient in resolution and accuracy. Therefore, user-generated GPS trajectories are to be used to calculate the incline of streets. Owing to the high relative accuracy found in the trajectories, it was possible to calculate the incline with an accuracy that is sufficient for many applications. The comparison with the SRTM DTM showed that the inclines derived from GPS data are better, with a standard deviation of σGPS = 1.6 % (σSRTM = 3.1 %). Only streets with at least 5 GPS trajectories were used to determine the standard deviation. In contrast to SRTM, the inclines could not be determined for all streets but only for 18 % of them (with more than 5 trajectories).


Table of Contents

1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Outline
2 Background
2.1 Global Navigation Satellite Systems
2.1.1 GPS Setup and Determination of Location
2.1.2 Error Sources
2.1.3 GLONASS, Galileo and Beidou
2.2 Volunteered Geographic Information
2.2.1 Terminology and Nature of VGI
2.2.2 Classification and Examples
2.3 OpenStreetMap
2.3.1 Introduction to Project
2.3.2 Data Model
2.3.3 Incline Information in OpenStreetMap
2.4 Data Mining
3 Related Work
3.1 3D Routing
3.1.1 Wheelchair routing
3.1.2 Energy-efficient routing
3.2 Extraction of Street Attributes from user-generated Movement Trajectories
3.3 Derivation of 3D information, using high-accurate GPS measurements
3.4 Map Matching
3.4.1 Categorization of Map Matching Algorithms
3.4.2 Functionality of Selected Algorithms
3.5 Smoothing of Time Series Measurements
4 Methodology
4.1 Definition of Pilot Region
4.2 Tools
4.3 Data
4.3.1 Crowdsourced GPS traces
4.3.1.1 Platforms and Devices
4.3.1.2 The GPX Format
4.3.1.3 OpenStreetMap GPS traces
4.3.1.4 Typical Errors
4.3.2 Street Network
4.3.3 Land Use Information
4.3.4 Digital Elevation Models
4.4 Workflow and Implementation
4.4.1 Data Import
4.4.1.1 GPS traces
4.4.1.2 OSM Street Network and Land Use Information
4.4.2 Preprocessing
4.4.2.1 GPS data
4.4.2.2 Street Network
4.4.3 Map Matching
4.4.4 Calculation of Incline
4.5 Validation
5 Discussion of Results
5.1 Analysis of Crowdsourced GPS traces
5.1.1 Vertical Absolute and Relative Accuracy
5.1.1.1 Absolute Accuracy
5.1.1.2 Relative Accuracy
5.1.2 Coverage and density
5.2 Analysis of Calculated Incline
5.2.1 Exclusion of data from the evaluation
5.2.2 Accuracy of GPS incline
5.2.2.1 Overall error
5.2.2.2 By Land Use Classes
5.2.2.3 By Terrain Classes (mountainous / flat)
5.2.2.4 Effect of Number of GPS Traces on Overall Accuracy
5.2.3 Comparison GPS incline and SRTM incline
5.2.3.1 By Land Use Classes
5.2.3.2 By Terrain Classes
5.3 Limitations of Approach
6 Conclusion and Outlook
6.1 Conclusion
6.2 Outlook
7 Bibliography


List of Figures

Figure 1: A steep slope of a street or path may be inaccessible for wheelchair users. (© Flickr user ‘Transguyjay’)
Figure 2: Depending on the street, zero to many GPS track points fall into one square of 1″ × 1″, equivalent to the horizontal resolution of SRTM-1. Due to the projection, the grid is not square. (Map: OSM)
Figure 3: Determination of a 2D position with three satellites (© Anja Köhn, Michael Wößner)
Figure 4: How the satellite constellation influences precision. In (a) the transmitters are orthogonal, which keeps the error region small. If the transmitters are closer together, the error region gets larger (b). (Langley 1999)
Figure 5: The Multipath and Shadowing effect (Conley et al. 2006, p. 280)
Figure 6: Density map of OpenStreetMap nodes (© OpenStreetMap wiki-user ‘Tyr’)
Figure 7: OSM data model for map features (left) and file system for GPX files (right) (adapted from Ramm & Topf 2010, p. 56)
Figure 8: Map Matching. The GPS trace (blue) is snapped to the street network (red) (Map: OSM)
Figure 9: Example of 'Median of 3' smoothing with the raw data (row 1) and the results using the single median smoothing and the repeated median smoothing. (Tukey 1977, p. 212)
Figure 10: Pilot region Heidelberg / Germany. (Map: OSM)
Figure 11: Example GPX file.
Figure 12: Screenshot of grid map, showing the number of GPS points per grid cell.
Figure 13: Elevation profile of a GPS trace, recorded on a flat street.
Figure 14: GPS traces with lost GPS signals in tunnels. (Map: OSM)
Figure 15: Difference of DSM and DTM.
Figure 16: Process of deriving incline information out of user-contributed GPS traces.
Figure 17: Filtering and import of GPS traces.
Figure 18: The schema of the relation 'gpx_data_line' for storing the GPS traces.
Figure 19: Flowchart of preprocessing the GPS traces
Figure 20: Columns of the relation which stores the preprocessed GPS traces.
Figure 21: Schema of relation 'streets'.
Figure 22: Enhancement of street network with land use information in cases where the land use polygon does not cover the street segment.
Figure 23: Flowchart of map matching process.
Figure 24: The map matching process: Select candidate traces with buffer (light green) of street (dark green) (a), create profile lines (blue) (b), select traces (red) which intersect at least 70 % of the profile lines.
Figure 25: Example of two parallel streets which do not have the same incline.
Figure 26: The tables 'gpx_data_line', 'streets_gpx' and 'streets' and their relation to each other.
Figure 27: Properties file of map matching tool
Figure 28: Workflow for calculating the incline of street segments.
Figure 29: Clipping of assigned GPS traces.
Figure 30: Screenshot of visualized GPS track points, colorized according to their elevation (green = low, red = high).
Figure 31: Vertical accuracy of crowdsourced GPS traces, distinguished by land use class.
Figure 32: Histogram with the differences of GPS and DTM elevation
Figure 33: Relative accuracy of crowdsourced GPS track points, overall and distinguished by land use.
Figure 34: Map showing the coverage of the streets with GPS traces. (Map: OSM)
Figure 35: The coverage with GPS traces for different street types.
Figure 36: Average distance between two adjacent GPS track points, differentiated by street type.
Figure 37: Visualization of the GPS incline. Streets with no coverage are not displayed. (Map: OSM)
Figure 38: Erroneously calculated DTM incline, due to irregularities of the LiDAR DTM.
Figure 39: Visualization of the error of GPS incline in the pilot region. (Map: OSM)
Figure 40: Histogram of the overall incline error in percent and the bell curve (red).
Figure 41: The percentage of streets with an incline error smaller than 2 % and their share with respect to the entire street network.
Figure 42: Situations where the calculated incline differs from the steepest incline.


List of Tables

Table 1: Categories of VGI project according to Jokar Arsanjani (2014)
Table 2: Usage of the key 'incline' and its values
Table 3: Overview of visibility options for the upload of GPS traces
Table 4: Values of highway tag and their share of length in percent.
Table 5: OSM landuse-tags and their characteristics.
Table 6: The effect of the relative accuracy on the calculated incline.
Table 7: The length of street segments for different incline error classes.
Table 8: The achieved accuracy of GPS incline differentiated by land use classes.
Table 9: The achieved accuracy of GPS incline differentiated by terrain classes.
Table 10: Comparison of SRTM and GPS incline in terms of amount of street network with an incline error smaller than 2 %.
Table 11: Comparison of the standard deviations of the incline error, overall and differentiated by land use classes.
Table 12: Comparison of the standard deviations of the incline, overall and differentiated by terrain classes.


1 Introduction

1.1 Motivation

Common routing and navigation systems such as Google Maps [1] or Here [2] do not consider elevation or incline information when calculating directions. This is because they were initially designed to calculate directions for cars or other fuel-powered vehicles, which generally do not benefit from incline information. However, there are many other use cases in which knowledge of the incline of streets is valuable. For cyclists, pedestrians and especially for mobility-restricted people, the incline of a planned route is highly relevant (cf. Figure 1). Some cyclists might prefer a slightly longer but less steep route, while others may prefer inclined streets for training purposes. Incline information is even more relevant for mobility-restricted people such as wheelchair users, people with walking aids or parents with pushchairs. For this group, steep streets or paths may be inaccessible, since they can only manage inclines up to a certain percentage uphill or downhill. The magnitude of incline that can be managed depends strongly on the disability and on the type of wheelchair (manual or electric). Moreover, electric wheelchairs, and electrically powered vehicles in general, have a higher energy demand when going uphill and a limited battery capacity. In addition, charging stations are still rare. Therefore, routing services can use incline information to compute the most efficient route in terms of power consumption (cf. Franke et al. 2012).

Figure 1: A steep slope of a street or path may be inaccessible for wheelchair users. (© Flickr user ‘Transguyjay’ [3])

[1] http://google.de/maps, checked on 15/07/2015
[2] http://here.com, checked on 15/07/2015
[3] Source of image: https://www.flickr.com/photos/jayw/2604877785, checked on 15/07/2015

The GIScience Research Group of the University of Heidelberg is currently working with other partners on a project to extend and improve the OSM routing service OpenRouteService.org so that it includes accessibility-related data. The motivation for this thesis arose from this project. The EU project is called CAP4Access, an acronym for ‘Collective Awareness Platforms for Improving Accessibility in European Cities and Regions’. The cooperation between the Institute of Technology Berlin (Technische Universität Berlin) and the University of Heidelberg for this thesis was established through this project. The aim of the project is to develop methods and tools for collectively gathering and sharing information about the accessibility of public spaces. The project focuses on different topics, including “Collective tagging”, “Participatory sensing” and “Routing and navigation”. The project has four pilot regions: Vienna (Austria), London (UK), Elche (Spain) and Heidelberg (Germany) (cf. empirica Gesellschaft für Kommunikations- und Technologieforschung mbH 2015).

Different types of digital elevation models (DEMs) may be used to calculate the incline of a street. The most accurate option is a high-resolution DEM acquired by airborne light detection and ranging (LiDAR). This method is very expensive, so open-licensed DEMs may be an alternative. Depending on the open-data strategy of the authorities, high-resolution DEMs are available for some regions (cf. OpenStreetMap Wiki 2015d). There are also two open-licensed DEMs that are almost globally available: SRTM and ASTER GDEM. The SRTM DEM was acquired during the Shuttle Radar Topography Mission (SRTM) and is available online with a horizontal resolution of 1 arc second (approx. 30 m) and an absolute elevation error of 6.2 m (cf. Farr et al. 2007). The ASTER Global DEM was compiled from data collected by the ‘Advanced Spaceborne Thermal Emission and Reflection Radiometer’ (ASTER), mounted on the Terra spacecraft. It has a horizontal resolution of 1 arc second (approx. 30 m) and a vertical accuracy of approximately 9 m (cf. Meyer 2011).
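The ground size of a 1 arc second cell can be estimated with a short calculation. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6,371 km and a latitude of roughly 49.4° N for the pilot region Heidelberg. The function name `arcsecond_cell_size` is illustrative and not part of the tools developed in this thesis.

```python
import math
from typing import Tuple

# Mean Earth radius in metres; using a sphere instead of the WGS84 ellipsoid
# is a simplifying assumption of this sketch.
EARTH_RADIUS_M = 6_371_000.0


def arcsecond_cell_size(lat_deg: float) -> Tuple[float, float]:
    """Return the (north-south, east-west) extent in metres of a 1" x 1" cell."""
    one_arcsec_rad = math.radians(1.0 / 3600.0)
    north_south = EARTH_RADIUS_M * one_arcsec_rad
    # Meridians converge towards the poles, so the east-west extent
    # shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# Approximate latitude of Heidelberg (assumed value for illustration).
ns, ew = arcsecond_cell_size(49.4)
print(f"north-south: {ns:.1f} m, east-west: {ew:.1f} m")
```

Under these assumptions a cell measures roughly 31 m from north to south but only about 20 m from east to west at this latitude, so the nominal 30 m applies strictly only near the equator.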

Open-licensed DEMs with high accuracy are not globally available, whereas those DEMs which are nearly globally available suffer from a poor horizontal resolution and vertical accuracy. Especially for hilly or mountainous regions and high-resolution scenarios this data might not be sufficient to derive the incline of streets with an acceptable accuracy. Therefore, I propose a method to derive incline information from GPS traces, contributed by users of the OpenStreetMap project.

Due to the rapid development of mobile phones with integrated GPS receivers, GPS traces can easily be recorded by anyone. According to Liu et al. (2014), a positional accuracy of 5 to 10 meters and a vertical accuracy of up to 25 meters can be expected from GPS traces collected with handheld GPS devices or smartphones. Admittedly, the vertical accuracy is very poor. However, for calculating the incline only the elevation difference between two adjacent points is relevant. If a GPS trace was recorded within a short time span in an open area, it may be assumed that all points of the trace were recorded under similar atmospheric influences and with a similar satellite constellation. Therefore, it may be expected that the track points of one GPS trace have a similar absolute error and consequently a fairly good relative accuracy. In addition, the coverage of a street by multiple GPS traces as well as a relatively high density of GPS track points may compensate for poor accuracy. Figure 2 shows a grid of 1 by 1 arc second, which is equivalent to the horizontal resolution of the SRTM DEM, together with the GPS track points extracted from the OpenStreetMap GPS data. It can be seen that for most streets many GPS track points fall into one square, but there are also some streets with no or only a few GPS track points.

Figure 2: Depending on the street, zero to many GPS track points fall into one square of 1″ × 1″, equivalent to the horizontal resolution of SRTM-1. Due to the projection, the grid is not square. (Map: OSM)
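The argument that a shared absolute elevation error does not affect the incline can be illustrated with a small calculation. The sketch below is an assumption-laden example rather than the implementation used in this thesis: the track points are hypothetical, the horizontal distance is computed with the haversine formula on a spherical Earth, and the names `haversine_m` and `incline_percent` are illustrative.

```python
import math

# Mean Earth radius in metres (spherical approximation, an assumption).
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two coordinates in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def incline_percent(p, q):
    """Incline in percent from point p to point q; each is (lat, lon, ele_m)."""
    run = haversine_m(p[0], p[1], q[0], q[1])
    if run == 0:
        raise ValueError("points coincide horizontally; incline is undefined")
    return 100.0 * (q[2] - p[2]) / run


# Hypothetical adjacent track points, about 100 m apart with 5 m of climb.
p = (49.4000, 8.6900, 110.0)
q = (49.4009, 8.6900, 115.0)

# The same points with a shared vertical error of +20 m.
OFFSET_M = 20.0
p_off = (p[0], p[1], p[2] + OFFSET_M)
q_off = (q[0], q[1], q[2] + OFFSET_M)

print(incline_percent(p, q), incline_percent(p_off, q_off))
```

Both calls yield the same incline of about 5 %, because the offset cancels in the elevation difference. In real traces the error is only approximately constant along the trace, so the cancellation is partial.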

1.2 Objectives

The objective of this thesis is to develop methods and tools to calculate the incline of a street network, including paths for pedestrians and cyclists. The incline shall be calculated from GPS traces collected by contributors to the OpenStreetMap project, since this may represent a low-cost alternative to expensive high-accuracy DEMs. The GPS data is a collection of GPS traces recorded by thousands of users with different devices and modes of transportation. Neither the device nor the mode of transportation is given in the data, which makes it difficult to judge the accuracy and density of the GPS track points. Therefore, the raw GPS data shall be assessed with regard to accuracy, coverage and density. Furthermore, the incline calculated from GPS traces shall be validated against a high-accuracy DEM obtained from LiDAR measurements to determine how accurately the incline was calculated. As described in the motivation, globally available DEMs are also an alternative to high-accuracy DEMs for deriving incline. Thus, the incline calculated from GPS traces shall also be compared to the incline derived from the SRTM-1 DEM. It is expected that, due to a higher density of elevation information, user-generated GPS traces will achieve a higher accuracy. The tools developed for this thesis shall be published and provided to the OpenStreetMap community, since tools for processing GPS data are still rare.


In summary, the aims of this thesis are as follows:

- Creation and implementation of a workflow to calculate the incline of streets, using user-contributed GPS traces.
- Assessment of the quality of voluntarily collected GPS traces in terms of
  o vertical accuracy (absolute and relative)
  o coverage of GPS traces
- Assessment of the achieved quality of the incline information, compared to LiDAR and SRTM-1 DEM.
- Publication of developed software as Open Source and provision to the OpenStreetMap community

1.3 Outline

The thesis is structured as follows. Chapter 2 discusses background information that is important for

this topic. This involves Global Navigation Satellite Systems,

Volunteered Geographic Information, the OpenStreetMap project and data mining. In chapter 3,

research on related topics is presented. The methodology of this research is

described in chapter 4, which includes the used data and tools as well as all the steps of deriving

incline from user-generated GPS traces. In chapter 5, the outcome of the methods described in chapter 4

is evaluated and discussed using statistical methods, with reference to a high-accuracy

DTM on the one hand and the low-cost alternative SRTM-1 DEM on the other hand.

Furthermore, chapter 5 includes the quality assessment of the GPS data. In chapter 6 this thesis will

be summarized and concluded and ideas on how to progress with this topic in the future will be

given.

2 Background

In the following chapter background information related to this research shall be given. Firstly, the

Global Positioning System and other Global Navigation Satellite Systems will be introduced and

their functionality explained, since GPS traces are one of the major data sources of this research.

The traces are collected voluntarily; therefore an overview of volunteered geographic information

(VGI) is given. After that, one of the most popular VGI projects, OpenStreetMap, will be introduced.

At the end of this chapter, the terms data mining and spatial data mining will be discussed.

2.1 Global Navigation Satellite Systems

Nowadays, Global Navigation Satellite Systems (GNSS) are an essential part in the field of

navigation and positioning. With GNSS it is possible to determine any location on the earth’s

surface with fairly good accuracy. Since this research is about mining information from movement

trajectories recorded with the help of such systems, an overview of which systems exist and how

they work shall be given. Several countries either operate a GNSS or are currently building one;

however, this section shall cover only the setup and functionality of the Global Positioning System

(GPS), since this is the first and most stable GNSS. GPS is the GNSS of the United States of

America. Furthermore, the method for determining a location will be explained, followed by an

overview of error sources and their impact on the accuracy. An overview of other GNSS is given at

the end in section 2.1.3.

2.1.1 GPS Setup and Determination of Location

The Global Positioning System (GPS) was developed and is operated by the U.S. Department of

Defense. It was initially developed for military reasons, and in the beginning the accuracy was

degraded for civilian use. This is known as Selective Availability (SA). In 2000, the degradation of

accuracy was switched off, which now offers higher accuracy to civilian users. This enabled the

realization of many standard applications, such as the private use of so-called Location-Based-

Services (cf. Hofmann-Wellenhof et al. 2008, pp. 309–311).

For the GPS setup, three segments play an important role. These are the space segment, the control

segment and the user segment. The space segment consists of 24 active and several spare satellites.

The active satellites are spaced in six orbits with an altitude of 20,200 km. The control segment

consists of several control stations distributed around the earth. The tasks of the control segment

are, among others, to track the satellites for the determination of their orbit and the synchronization

of the atomic clocks, mounted on the satellites. The user segment is referred to as the receiver,

which receives the signals emitted by the satellites and calculates the current location. Further

information on the GPS segments can be taken from Hofmann-Wellenhof et al. (2008, pp. 322–

327).

As already mentioned, the position of the satellites at a certain time is known through the orbital

parameters, which are observed by the control segment. The satellites are constantly emitting

signals which can then be received by the receiver. The signal contains two carrier waves, L1 with

a frequency of 1575.42 MHz and L2 with 1227.60 MHz. Codes are modulated onto both waves;

they represent a message containing information about the satellite, such as orbit parameters

and time of signal emission. While on L1 both the C/A-code (coarse/acquisition) and P-code

(precision) are modulated, L2 only carries the P-code. The combination of the P-code from two

carrier waves allows a higher positional accuracy through the elimination of ionospheric

influences. In addition, the P-code is encrypted to ensure that it is only available to authorized users,

like the military (cf. Hofmann-Wellenhof et al. 2008, pp. 315-322).

Hofmann-Wellenhof et al. (2008, pp. 161-191) mention three mathematical models for position-

ing. These are single point positioning, differential positioning and relative positioning. Single

point positioning and differential positioning will be described below. Relative positioning is not

relevant for this research. For further information on this, see Hofmann-Wellenhof et al. (2008,

pp. 173-191). Single point positioning is applied when determining the position using smartphones

or other handheld devices. When using this method, the pseudoranges between the satellites and the

receiver are determined. This can be done by either using the code modulated on the carrier waves,

using the phase of the carrier wave, or based on Doppler data. Here, only the first approach will be

explained, since common smartphones or other handheld devices make use of the code. To

determine the 2D position (X,Y) of a location, the pseudoranges of at least three satellites are

necessary. Two of them are used to calculate the distance between satellites and receiver by

multiplying the time of travel by the speed of light. To determine the time of travel, it is necessary

for the receiver’s clock to be synchronized with the satellite clock. Since this is not the case prior to

measuring, the pseudorange to the third satellite must be known in order to correct the clock bias.

This is depicted in Figure 3. The solid lines show the pseudorange to the satellites before the clock

correction. It can be seen that there are three intersection points which are possible locations for the

receiver, depicted as ‘B’. After the correction of the clock and the correction of the pseudoranges

(dashed line) which follows, only one intersection point remains (A).
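The principle shown in Figure 3 can be illustrated with a small numerical sketch. It is not the procedure of a real receiver, only a minimal planar example under stated assumptions: each pseudorange is modelled as the geometric distance plus the speed of light times an unknown receiver clock bias, and the three unknowns (X, Y, clock bias) are found iteratively from three pseudoranges. The function names, the start value and the transmitter coordinates used in the test are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def solve_2d_position(transmitters, pseudoranges,
                      initial_guess=(0.0, 0.0), iterations=20):
    """Estimate X, Y and the receiver clock bias from three pseudoranges.

    Assumed model: pseudorange_i = distance_i + C * clock_bias.
    Returns (x, y, clock_bias_in_seconds).
    """
    x, y = initial_guess
    bias_m = 0.0  # clock bias expressed as a range in metres
    for _ in range(iterations):
        jacobian, residual = [], []
        for (tx, ty), rho in zip(transmitters, pseudoranges):
            dist = math.hypot(x - tx, y - ty)
            # Partial derivatives with respect to x, y and the bias.
            jacobian.append([(x - tx) / dist, (y - ty) / dist, 1.0])
            residual.append(rho - (dist + bias_m))
        dx, dy, db = solve_linear(jacobian, residual)
        x, y, bias_m = x + dx, y + dy, bias_m + db
        if max(abs(dx), abs(dy), abs(db)) < 1e-6:
            break
    return x, y, bias_m / C
```

With only two pseudoranges the system would be underdetermined, which is why the third transmitter is needed to resolve the clock bias.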

Figure 3: Determination of a 2D position with three satellites (©Anja Köhn, Michael Wößner4)

When determining a 3D position (X,Y,Z), four instead of three satellites are necessary, since there

is one more unknown. Considering all measurements, a non-linear equation system as shown in

Hofmann-Wellenhof et al. (2008, p. 162) can be solved to determine the unknown coordinates of

the receiver’s location. According to Cosentino et al. (2006, p. 379), a horizontal accuracy of

around 10 m can be achieved in 95 % of cases when applying single point positioning with one

frequency. This is because the measurements are influenced by several factors which will be

discussed in section 2.1.2. To omit some of the influences and thereby improve the accuracy of the

measurement, differential positioning may be performed. To do so, two receivers are needed,

a reference receiver and a remote receiver. The coordinates of the reference station are known and

considered as true value. Consequently, the error of the observed pseudoranges can be determined.

The observed error can then be transmitted to the remote receivers and will be used to correct the

pseudoranges (cf. Hofmann-Wellenhof et al. 2008, p. 169).
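The differential approach described above can be sketched in a few lines. The example is deliberately simplified and assumes that both receivers observe the same satellites at the same time and that the error of a pseudorange is identical at both locations; the function names and the dictionary layout keyed by satellite identifier are hypothetical.

```python
import math


def pseudorange_corrections(reference_position, satellite_positions,
                            observed_pseudoranges):
    """Derive one correction per satellite at the reference station.

    correction = true geometric range - observed pseudorange
    """
    corrections = {}
    for sat_id, sat_position in satellite_positions.items():
        true_range = math.dist(reference_position, sat_position)
        corrections[sat_id] = true_range - observed_pseudoranges[sat_id]
    return corrections


def apply_corrections(remote_pseudoranges, corrections):
    """Correct the pseudoranges of the remote receiver.

    Satellites without a correction are dropped.
    """
    return {
        sat_id: rho + corrections[sat_id]
        for sat_id, rho in remote_pseudoranges.items()
        if sat_id in corrections
    }
```

The corrected pseudoranges can then be used for positioning in the same way as uncorrected ones.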

2.1.2 Error Sources

As already mentioned, the measurements are influenced by errors arising from different sources.

Hofmann-Wellenhof et al. (2008) categorize the errors according to their sources, namely satellite,

signal propagation and receiver.

Satellite:

Errors originating from the satellite are the satellite clock bias and orbital errors. The highly-

accurate atomic clocks of the satellites are controlled and frequently updated by the control

segment on earth in order to synchronize the satellites among themselves. The clocks get an update

once a day, and therefore the clock error is small immediately following the update and increases

until the next update. In addition to the clock biases, errors are also contained in the ephemeris

4 Source of image: http://www.kowoma.de/gps/Positionsbestimmung.htm, checked on 15/07/2015

data, transmitted to the receiver. The ephemeris data contains information about the satellite’s orbit

and is used to calculate the satellite’s position. The orbital parameters are estimated and may differ

from the actual orbit of the satellite (cf. Conley et al. 2006, pp. 304 f.).

In addition to the aforementioned error sources originating from the satellite, precision also

depends on the satellite constellation in the sky. The effect is shown in Figure 4 in the case of a

simple ranging system with two transmitters. When the rays of the two transmitters (satellites) have

an intersection of 90 °, the region in which the receiver may lie is relatively small (Figure 4a). If

the transmitters are closer together as shown in Figure 4b, the region becomes larger and with it the

uncertainty of the location determination. This is the reason why the vertical accuracy is generally

worse than the horizontal one. In case of the horizontal coordinates, the satellites may be in good

constellation, meaning that there is a satellite in every direction, keeping the error region small. In

case of the vertical coordinate, all satellites are above the receiver and therefore only in one

direction. (cf. Langley 1999)
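The influence of the constellation is commonly quantified as dilution of precision (DOP), a term that is not used in the text above and is introduced here only for illustration. The following sketch assumes the usual geometry matrix with one row per satellite (unit line-of-sight vector in east, north and up plus a clock column) and derives a horizontal and a vertical DOP value from it. The function names and the example constellation in the test are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dilution_of_precision(azimuths_deg, elevations_deg):
    """Return (HDOP, VDOP) for satellites given by azimuth and elevation."""
    rows = []
    for az_deg, el_deg in zip(azimuths_deg, elevations_deg):
        az, el = math.radians(az_deg), math.radians(el_deg)
        rows.append([math.cos(el) * math.sin(az),   # east
                     math.cos(el) * math.cos(az),   # north
                     math.sin(el),                  # up
                     1.0])                          # receiver clock
    # Normal matrix G^T G; its inverse holds the variance factors.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
              for i in range(4)]
    diagonal = []
    for i in range(3):
        unit = [1.0 if k == i else 0.0 for k in range(4)]
        diagonal.append(solve_linear(normal, unit)[i])
    hdop = math.sqrt(diagonal[0] + diagonal[1])
    vdop = math.sqrt(diagonal[2])
    return hdop, vdop
```

For a constellation with satellites in all four horizontal directions and one at the zenith, the vertical value comes out larger than the horizontal one, which reflects the statement that all satellites lie above the receiver.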

Figure 4: How the satellite constellation influences precision. In (a) the transmitters are orthogonal, which keeps

the error region small. If the transmitters are closer together, the error region gets larger (b). (Langley 1999)

Signal Propagation:

During the propagation of signals through the atmosphere, a delay occurs. The ionosphere, which is

the layer from approximately 50 km to 1000 km above the earth, is a dispersive medium. The

dispersion is dependent on the frequency. Thus, it is possible to correct the ionospheric influences

when applying a dual-frequency single point positioning. The different frequencies L1 and L2 (cf.

2.1.1) have a different delay, or in other words, a different propagation speed. A correction can

therefore be determined (cf. Conley et al. 2006, p. 161).
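As a sketch of how such a correction can be determined, the following example assumes the common first-order model in which the ionospheric delay is inversely proportional to the squared frequency. Under this assumption the delay cancels in a weighted combination of the two pseudoranges. The function names are hypothetical; the frequencies are those given in section 2.1.1.

```python
F_L1 = 1575.42e6  # Hz
F_L2 = 1227.60e6  # Hz


def ionosphere_free_pseudorange(p1, p2, f1=F_L1, f2=F_L2):
    """Combine two pseudoranges so that the first-order delay cancels.

    Assumed model: p_i = rho + A / f_i**2 with an unknown constant A.
    """
    return (f1 ** 2 * p1 - f2 ** 2 * p2) / (f1 ** 2 - f2 ** 2)


def ionospheric_delay_l1(p1, p2, f1=F_L1, f2=F_L2):
    """Estimate the first-order ionospheric delay on L1 in metres."""
    return f2 ** 2 / (f1 ** 2 - f2 ** 2) * (p2 - p1)
```

Since the P-code on L2 is encrypted, this combination is generally not available to the single-frequency handheld devices considered in this research.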

Receiver:

Errors caused on the receiver side are, among others, the multipath effect and shadowing. Both are

depicted in Figure 5. The multipath effect occurs when in addition to the direct signals, reflected

signals from the surfaces of nearby structures are also captured by the receiver. This leads to errors

in the calculation of the pseudorange, since the reflected signal traveled a longer way and conse-

quently took more time. Shadowing occurs when the view from the receiver to the satellite is

shadowed by trees or roofs. As a result, the signal reaches the receiver either with low energy or

not at all and cannot be used for positioning. Multipath and shadowing can also occur in combina-

tion, as shown in Figure 5. The signal reflected on the building is received with higher energy than

the signal shadowed by the canopy. Multipath and shadowing effects are random and highly

dependent on the time and the receiver’s location. The error caused by these effects can be high in

magnitude and can sometimes be the main contributor to the error in comparison to the other error

sources (cf. Conley et al. 2006, pp. 279-280). User-generated GPS traces are recorded without

consideration of such effects, including at locations where multipath and shadowing account for a large

share of the error. This is mainly the case in urban areas with tall buildings or in forested areas.

Figure 5: The Multipath and Shadowing effect (Conley et al. 2006, p. 280)

2.1.3 GLONASS, Galileo and Beidou

In addition to GPS, other GNSS worth considering include GLONASS, Galileo and Beidou.

GLONASS is operated by Russia and is, like GPS, fully operational with 21 active and 3 spare

satellites in three orbital planes (cf. Hofmann-Wellenhof et al. 2008, pp. 348-349). After 1996,

when GLONASS was fully operational for the first time, several satellites failed over the years and

GLONASS could not be operated with the full coverage. Several new satellites were launched, but

this was not enough to maintain the full constellation (Feairheller & Clark 2006). Nowadays,

GLONASS is again fully operational. Galileo and Beidou are still under construction and do not

have world-wide coverage yet. While Galileo, the European answer to GPS and GLONASS, has

only four satellites launched, the Chinese Beidou consists of 14 satellites and already operates in

the Asian-Pacific regions. It is planned that Beidou will reach its full constellation with 35 satellites

in different orbits by 2020. Once this happens, it will have world-wide coverage, similar to

GPS and GLONASS (Santerre et al. 2014). Galileo is a project of the European Union and

European Space Agency to build a GNSS, similar to GPS and GLONASS, but under civilian

control. The development of the system was initiated in 1994. The first idea was to cooperate with

the United States to develop a “next-generation” GPS; however, the United States did not wish to

cooperate with foreign countries. Therefore, it was decided to build up a new and independent

GNSS, which is interoperable with the existing GPS system and GLONASS. Consequently,

receivers could use the three systems in combination. Moving forward, it will be possible to

achieve higher accuracies, since more satellites are involved in the positioning process (cf.

Hofmann-Wellenhof et al. 2008, pp. 365-367). Galileo is currently still under construction and it is

planned to be fully operational by 2020. Two satellites were launched in both 2011 and 2012. With

a total number of four satellites, first tests of the system were then possible. When the system is

fully operational, a total of 30 satellites will be orbiting around the earth at an altitude of

23,222 km. Out of the 30 satellites, 27 will be actively used and three will be available as replace-

ments (European Space Agency 2015).

2.2 Volunteered Geographic Information

For this research voluntarily collected GPS and street level data is used. For this special type of

data the term ‘Volunteered Geographic Information’ has emerged (Goodchild 2007). It describes a

special case of user-generated content (UGC). According to Bauer (2010) the term UGC has been

used since the mid-nineties for content in the internet, which is produced by the user. Due to the

fast development of technologies regarding the internet, it has become possible and affordable for

many users to have fast internet access. This development has made it possible for the user not only

to search the internet, but also to create new content. The term UGC is kept general intentionally,

since it may be any kind of media such as videos, pictures or text. When this data refers to a spatial

location, it is known as ‘Volunteered Geographic Information’. The terminology and characteristics

of VGI are discussed below, and examples of how VGI can be classified into groups are provided.

2.2.1 Terminology and Nature of VGI

The term ‘volunteered geographic information’ (VGI) was introduced by Goodchild in 2007. It

describes a phenomenon which was new in the field of geography at the time. Geographic

information is collected voluntarily by mostly untrained people without any financial compensa-

tion. He also calls this phenomenon “citizens as sensors”.

It has also been referred to as ‘crowdsourcing geospatial information’ by Heipke (2010) and Ramm

et al. (2011). Sui (2008) describes the recent development as the ‘wikification of GIS’ and points

out that the actors and methods of collecting geographic information have changed. Previously, only

experts like surveyors or cartographers acquired and processed geodata, which was expensive.

Nowadays, there is a large amount of data freely available and the people who are acquiring and

processing the data are not necessarily experts anymore.

Resch (2013) distinguishes between different concepts of acquiring the data. In his paper, he

discusses the difference between the terms ‘people as sensors’, ‘collective sensing’ and ‘citizen

science’. While these concepts are closely related, there are differences worth noting. ‘People as

sensors’ describes the concept of people who collect information through subjective observations.

This might, for example, be the smoothness of a street surface or the water quality of lakes. The

term ‘collective sensing’ “[…] analyses anonymized data coming from collective networks, such as

Flickr, Twitter, Foursquare or the mobile phone network” (Resch 2013). The third term, ‘citizen

science’, means that people contribute data, collected by sensors integrated in their smartphone or

other devices. In comparison to ‘people as sensors’, this data is not subjective and only comes from

sensor measurements.

VGI is often collected with the help of low-cost GPS receivers, integrated in most smartphones or

other handheld GPS devices. With those devices the coordinates of a location can easily be

determined. The acquired coordinates or GPS traces may then be used, for example, to digitize the

outline of a street traveled or to mark points of interest at the measured location. Images may also

be georeferenced by adding coordinates. Another way of gathering information is digitizing

features from satellite imagery. (cf. Goodchild 2007)

Sester et al. (2014) provide an overview of characteristics of VGI. Volunteered Geographic

Information can be highly heterogeneous in terms of the quality and coverage. Depending on the

number of volunteers, some regions may be more complete than others. Figure 6 shows the density

map depicting the nodes available in OpenStreetMap, the most famous VGI-project. The brighter

the color is, the more nodes lie within that pixel. It can be seen that developed regions with a high

population density like Europe and North America have more nodes than others. This may be

attributed to the fact that there are more people living in these places who are potential contributors,

while also considering that there may be more features to digitize. If there are more volunteers in a

region, the data is also more likely to be up-to-date. Especially in comparison to authoritative data

which is usually updated in certain cycles, VGI is updated whenever a volunteer detects a change

in the real world. Another characteristic of VGI is the heterogeneity with respect to semantic

information. In particular, when collecting topographic data in OpenStreetMap, there is no

standardized catalogue of features and their semantic information. The semantic information is

added as key-value pairs, which are commonly discussed in the community; however, in practice a

user does not need to follow these agreements.

Figure 6: Density map of OpenStreetMap nodes (© OpenStreetMap wiki-user ‘Tyr’5)

2.2.2 Classification and Examples

There are plenty of projects which somehow deal with geographic information. Jokar Arsanjani

(2014) categorized the projects according to the purpose or type of data (topographic, images,

video, text) which is being shared. The categories are listed in Table 1.

World mapping projects | Weather mapping | Business mapping

Social media mapping | Crisis and disaster mapping | Transportation mapping

Environmental and ecological monitoring | Outdoor activity mapping | Crime mapping and tracking

Table 1: Categories of VGI projects according to Jokar Arsanjani (2014)

Hahmann (2014) extends this list with ‘encyclopedic projects’ such as Wikipedia6. For this thesis,

the most popular world mapping project, OpenStreetMap7, is of relevance as a data source for

street network and GPS traces. OpenStreetMap is about volunteers creating a world map

under an open license. Here the contributors generate map features by digitizing recorded

GPS traces or satellite imagery. Next to digitized topographic data, raw GPS traces are also

collected within this project. In section 2.3 this project will be explained in more detail. Other

5 Source of image: http://wiki.openstreetmap.org/wiki/File:OSM-node-density-map-2013.png, checked on 15/07/2015

6 http://wikipedia.org

7 http://openstreetmap.org

‘world mapping projects’ are, among others, Wikimapia, which, similar to OpenStreetMap, aims to

mark geographical objects, and Google Map Maker8, which is operated by Google and was

initiated to improve the quality of Google Maps9.

Furthermore, ‘outdoor activity mapping’ projects, such as WikiLoc10 or GPSIES11, shall be

mentioned. These projects aim to collect GPS traces of outdoor activities undertaken by the

contributors. The purpose is to provide outdoor routes including additional information, such as

points of interest, distance or elevation profile. In addition, traces may be rated and can therefore be

used to search good outdoor routes, depending on the intention of the user. Those projects are a

potential data source of GPS traces for the derivation of incline values.

2.3 OpenStreetMap

This section briefly introduces the VGI-project OpenStreetMap, since the data of this project is

used for this research. Firstly, the project will be introduced in general and a short history will be

given. Secondly, how the geographic and semantic information is handled within this project will

be explained, followed by an assessment of how incline can be represented and how often it is

actually mapped.

2.3.1 Introduction to Project

The OpenStreetMap (OSM) project was founded by Steve Coast at University College London in

2004 and aims to create a freely and globally available map. Information like map features

including their semantic information are added and modified by the community. The way of

contributing data is typical for a VGI-project. In the first years of the project, data was exclusively

contributed by capturing the travelled path with a GPS-device, followed by the digitization of the

recorded route on a computer. For editing the maps, several editors are available, such as JOSM or

iD. These editors make it very convenient to load the GPS raw data and create geometries.

Furthermore, the editors handle the subsequent upload of the created features to the OSM

database. In addition to the vector geometry of the map feature, the GPS raw data can also be

uploaded (Ramm & Topf 2010, pp. 3 f.). In 2007, the company Yahoo! allowed OSM-contributors

to use their aerial imagery for the digitization of map features. With the satellite imagery, it became

very easy to create features such as buildings, which are hard to measure using a handheld GPS

device. This also enables contributors to create features remotely, without being on-site or having

any local knowledge about the region (Haklay & Weber 2008). Three years later, in 2010,

Microsoft also provided their aerial imagery for the purpose of contributing to OpenStreetMap

8 https://google.com/mapmaker

9 http://google.com/maps

10 http://www.wikiloc.com/

11 http://www.gpsies.com/

(OpenStreetMap Wiki 2015a). Other sources of data include donations from public agencies and

the integration of other open data. An example is the integration of the entire street network of the

Netherlands, after the donation of the company AND (Automotive Navigation Data) in 2007.

The data, stored in the OSM database is licensed under the Open Database License (ODbL). It

allows everybody to share the data, produce their own work from it and redistribute it, as long as

the new database is also published under ODbL and OSM and its contributors are attributed (Open

Knowledge Foundation 2015). The OSM database has not always been under ODbL. From the

beginning of the project until September 2012, the data was licensed under the terms of the

Creative Commons – Attribution-ShareAlike (CC-BY-SA) license. CC-BY-SA was made for

creative works, such as music and pictures. Therefore, it could hardly be used for collections of

data or databases such as OpenStreetMap, since the mentioned terms are hard to interpret for

databases. Furthermore, it was not possible to mix data under CC-BY-SA with data under other

licenses. This was made possible with the change to ODbL. (OpenStreetMap Foundation Wiki

2015b) The process of changing the license was initiated by the OpenStreetMap Foundation. It is a

non-profit organization, founded in Great Britain, to support the OSM project with organizational

tasks. Among other things, the foundation hosts the OpenStreetMap servers, helps with collecting

donations for servers and supports the community with organizing events, like so-called mapping

parties or conferences. The OpenStreetMap Foundation also organizes working groups which act as

support in specific fields or topics. An example is the Operations Working Group, which is

responsible for issues related to servers and the OSM API. (cf. OpenStreetMap Foundation Wiki

2015a, 2015c)

Over the years from the beginning of the project in 2004 to now, the OSM project has become

more and more popular. Haklay & Weber (2008) say that “[…] OpenStreetMap (OSM) is probably

the most extensive and effective project currently under development”. There are now over 1.9

million registered users, who have contributed over 2.75 billion nodes and over 250 million line

objects. Worldwide, the users uploaded GPS traces providing approximately 4.5 billion track points

(OpenStreetMap Wiki 2015c).

As is typical for a VGI project, the quality and completeness of such data may vary. Neis et al.

(2012) evaluated the OSM street network of Germany from 2007 to 2011 using a dataset of a

commercial provider. If only streets that can be used for car navigation are taken into account

(name or route number of streets is known), the street network of OSM is 9 % smaller than the data

from the commercial provider. But if the entire street network is considered, the OSM dataset is

27 % larger or even 31 % larger, if paths for pedestrians are considered. This means there are

streets or paths in OSM which do not exist in the commercial data set. The reason for this can be

found in the fact that OSM contains small hiking trails, paths or bicycle lanes, which are not

relevant for the commercial map provider. Since this evaluation was made four years before the

time of writing, it can be expected that the completeness of the street network has now improved

even further. This shows the potential of the OpenStreetMap dataset and proves its suitability for

many applications. According to Neis & Zielstra (2014b) it has been proven in the past that the

OSM data can be used for various applications such as crisis management, mapping for different

purposes (hiking, public transport) or routing. There are several routing services for different

purposes, such as komoot12 for cycling and hiking or Skobbler13 for car navigation.

2.3.2 Data Model

As mentioned in section 2.3.1, the OpenStreetMap project stores both the digitized map features

and the GPS traces as raw data. While the map features follow a data model, the GPS traces

are stored in a file system as GPX files14 (cf. Ramm & Topf 2010, pp. 317–318). The data model of

the map features follows a simple approach. Figure 7 shows the GPX file system next to the object

types available in OpenStreetMap and their relations to each other. A node is a representation of a

point on the earth’s surface, described by longitude and latitude. Ways are the representation of

linear features. Instead of being defined by a sequence of coordinates, a way object references up to

2000, but at least two, ordered nodes. Since the referenced nodes are ordered, the way is directed

from the first to the last node. If a line is closed (starting point is equal to the end point) the way

can, but need not necessarily, be considered a polygon. The third object type is a relation. Several

nodes, ways and/or other relations can be referenced by a relation object. This is done when

different objects are somehow related to each other. This is the case, for example, when a number

of ways define the route of a bus within a city. Semantic information is added to all objects by

defining a certain number of tags. A tag is a key-value-pair, separated by ‘=’. A tree, for example,

would have the tag ‘natural=tree’. Tags may have any combination of key and value; however, for

consistency reasons, the community agreed on a list of tags15, which should be used for mapping.

This concept of adding semantic information to a geometry object has the advantage that new tags

can always be introduced, if required for special use-cases (cf. Ramm & Topf 2010, pp. 55-59).
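The three object types and their tags can be summarized in a small sketch. It is only an illustration of the data model as described above and not the schema of the OSM database; the class and attribute names are hypothetical, and relation members are reduced to pairs of object type and identifier.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A point on the earth's surface with optional tags."""
    id: int
    lat: float
    lon: float
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """A linear feature referencing an ordered list of node ids."""
    id: int
    node_refs: list[int]
    tags: dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # A way references at least two and at most 2000 nodes.
        if not 2 <= len(self.node_refs) <= 2000:
            raise ValueError("a way needs between 2 and 2000 nodes")

    @property
    def is_closed(self) -> bool:
        """True if the start node equals the end node."""
        return self.node_refs[0] == self.node_refs[-1]


@dataclass
class Relation:
    """A group of nodes, ways and/or other relations."""
    id: int
    members: list[tuple[str, int]]  # (object type, object id)
    tags: dict[str, str] = field(default_factory=dict)


# Example objects: a tree, a street segment and a bus route.
tree = Node(1, 49.41, 8.69, {"natural": "tree"})
street = Way(10, [2, 3, 4], {"highway": "residential"})
bus_route = Relation(100, [("way", 10)], {"type": "route", "route": "bus"})
```

Whether a closed way represents a polygon cannot be decided from the geometry alone, which is why the sketch only reports that the way is closed.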

12 http://komoot.de, checked on 15/07/2015

13 http://skobbler.de, checked on 15/07/2015

14 GPX is the data format, based on XML, for exchanging GPS-traces.

15 http://wiki.openstreetmap.org/wiki/Map_Features, checked on 15/07/2015

Figure 7: OSM data model for map features (left) and file system for GPX files (right) (adapted from Ramm & Topf

2010, p. 56)

2.3.3 Incline Information in OpenStreetMap

The information about the incline can easily be added to ways as semantic information. According

to the list of proposed OSM map features as mentioned in 2.3.2, the key ‘incline’ should be used

when adding this information to a street or a path. The corresponding value of the tag is the actual

incline value, given in percent or degrees. Since there are two possible units, the unit must be indicated

with ° or %¹⁶. Positive or negative values indicate whether the way is inclined up- or downwards,

depending on the direction of the way. When the inclined part of a street does not cover the entire

street segment, the street should be split at the start and end of the inclined part. Furthermore, it is

recommended that the steepest incline along the path shall be added as a value. If the exact incline

value is unknown, but it is visible that the street is inclined, the value of the key ‘incline’ can also

be ‘up’ or ‘down’ (cf. OpenStreetMap Wiki 2015e).
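
The tagging rules above can be illustrated with a small parser. This is a minimal sketch, not part of any OSM tool: the function name `parse_incline` and the decision to normalize everything to percent are assumptions made here. The conversion uses the standard relation percent = 100 · tan(angle). Free-text values such as ‘moderate’ are returned as `None`.

```python
import math
import re
from typing import Optional, Union

# Signed number followed by a unit, e.g. "6%", "-10 %" or "8°"
_NUMERIC = re.compile(r"^\s*([+-]?\d+(?:\.\d+)?)\s*(%|°)\s*$")


def parse_incline(value: str) -> Union[float, str, None]:
    """Return the incline in percent, 'up'/'down', or None if unrecognized."""
    text = value.strip().lower()
    if text in ("up", "down"):
        return text
    match = _NUMERIC.match(text)
    if match is None:
        return None
    number, unit = float(match.group(1)), match.group(2)
    if unit == "%":
        return number
    # Degrees are converted to percent: rise over run times 100
    return math.tan(math.radians(number)) * 100.0
```

With this sketch, ‘incline=8°’ corresponds to roughly 14 %, which shows why the unit has to be stated explicitly in the tag value.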

value of key ‘incline’    share of OSM ‘highway’ features with incline information

‘up’                      44.3 %

‘down’                    30.8 %

others                    24.9 %

Table 2: Usage of the key 'incline' and its values¹⁷

¹⁶ Examples are: ‘incline=6%’, ‘incline=8°’, ‘incline=up’, ‘incline=down’

¹⁷ source: https://taginfo.openstreetmap.org, checked on 15/07/2015

Out of over 83 million (83,299,544) OSM features tagged with ‘highway’, only 0.2 % (169,121)

have information about the incline. This figure also includes all paths, such as footpaths or bicycle

lanes. From Table 2 it can be seen that out of the 0.2 %, the main part (~ 75 %) has the value ‘up’

or ‘down’, giving only the information that the path is inclined, but not to what extent. For the other

25 %, the incline is mostly defined more specifically in percent or degrees; however, a few are also

described with words such as ‘moderate’ or ‘extremely steep’. To summarize, it can be said that

there is hardly any information present about the incline of paths in OSM, and if so, the infor-

mation is not very specific. This has several possible reasons. On the one hand, it may be difficult

to attract the contributors’ attention to a tag which will not be displayed on the map. On the other

hand, the incline cannot be digitized from GPS traces or aerial imagery, like other features. It

somehow needs to be measured or estimated with the use of special tools like a measuring tape,

an inclinometer or a smartphone with a built-in gyroscope. Measuring the incline is therefore time-

consuming and the contributors have to be on-site, since the incline cannot be mapped using simple

methods.

2.4 Data Mining

The aim of this thesis is to gain or extract information out of a vast amount of data. Such a process

is generally referred to as ‘Data Mining’. Next to data mining, the term knowledge discovery from

data is used in academic literature. While sometimes both terms are considered synonymous, data

mining can also be seen as one step in the process of knowledge discovery from data (KDD), as in

Fayyad et al. (1996). They describe the steps of the KDD process as data selection, preprocessing,

transformation, data mining, and interpretation and evaluation of the results. Data mining in

this process chain of KDD refers to “[…] applying data analysis and discovery algorithms that

produce a particular enumeration of patterns […] over the data”. KDD is an iterative

process in which loops between any two steps are possible. According to Han & Kamber (2006),

many fields consider the terms data mining and KDD synonyms, probably because the term data

mining is much shorter. Hence, they define data mining as follows:

“Data mining is the process of discovering interesting patterns and knowledge from large

amounts of data. The data sources can include databases, data warehouses, the web, other

information repositories, or data that are streamed into the system dynamically.”

Consequently, for this thesis both terms are used synonymously as the entire process of gaining

knowledge, including all steps as mentioned in Fayyad et al. (1996).

Data mining can be applied to any kind of data, such as information about books in a library, data

about customers (personal data or transactions), search engine queries or user-generated content of

different online communities such as Facebook, Instagram or OpenStreetMap (cf. section 2.2). The

field of data mining has grown out of the need to handle the data, after devices and methods were

developed to capture and store data of this amount. It comprises different methods and techniques,

such as detection of patterns and acquiring knowledge about the association and correlation of a

data collection. A collection of data is therefore always needed, since such information cannot be

obtained from a single record (cf. Han & Kamber 2006, pp. 5-7).

For spatial or geographic data special techniques and methods were developed and the field of

spatial data mining has emerged. Shekhar et al. (2004) argue that spatial data is not compatible

with regular data mining techniques, due to the complexity and intrinsic spatial relationships.

Spatial data mining uses techniques and methods from the field of spatial analysis as well as the

field of general data mining, as mentioned in the paragraph above. Mennis & Guo (2009) review

commonly used methods and techniques. Spatial classification can be divided into supervised

and unsupervised classification. Different objects are grouped into classes based on their properties.

Contrary to the unsupervised classification, which is also known as clustering, the supervised

classification needs a training dataset to detect the members of a group. An unsupervised method is

spatial clustering, where points are classified according to their spatial location. Spatial classifica-

tion methods generally consider neighboring objects, while this is not undertaken in general

classification methods. Another method, commonly used in spatial data mining, is the point pattern

analysis. It is also a clustering method and tries to extract areas in which an unusual number of

events occur. An example is the detection of streets where accidents occur more often than on other

streets. This method is also known as Hot Spot Analysis. Further information on clustering

methods can be found in Mennis & Guo (2009).

3 Related Work

This chapter describes which applications need incline information and what research has been

done with regard to mining information out of user-generated GPS traces. Furthermore, research

related to the extraction of 3D information out of GPS data will be reviewed. For mining street

information out of GPS data, it is essential to know on which street the traces were recorded. This

process is formally known as Map Matching and, after a short review of different types of

algorithms, two of them are explained in more detail. At the end of this chapter, different methods

for smoothing time series measurements are reviewed. This will be an essential step in prepro-

cessing the GPS data.

3.1 3D Routing

There are several routing applications which rely on elevation information such as routing for sport

activities, wheelchair routing or energy-efficient routing for electric-powered vehicles (e.g. E-cars,

Pedelecs¹⁸ or electric wheelchairs). In the following section, different projects related to this topic

are presented. For projects relying on VGI, a common problem is the lack of information regarding

the elevation or incline.

3.1.1 Wheelchair routing

Compared to navigation systems for cars, the routing for mobility-restricted people, such as

wheelchair users, elderly people with push chairs or temporarily impaired people, is more complex.

People belonging to one of these user groups may all have slightly different requirements for a

route. This highly depends on the individual disability and the type of assistive equipment

(pushchair, manual wheelchair, electric wheelchair). Ding et al. (2007) studied the requirements for

a wheelchair navigation system, through an empirical study with physically impaired people and

their assistants. Among other attributes like condition of the sidewalk or information about stairs

and ramps, the street incline is of high relevance for wheelchair routing. Furthermore, Menkens et

al. (2011) investigated the needs of wheelchair users regarding a navigation system

which meets their requirements. In terms of incline, they found that the maximum incline

which can be passed with a manual wheelchair is in general between 3% and 8% and for electric

wheelchairs up to 10%.

There are many investigations dealing with the development of routing algorithms meeting the

needs of mobility-restricted people (e.g. Müller et al. 2010; Neis & Zielstra 2014a). The main

problem that exists is the lack of data regarding sidewalk information, surface of sidewalk, curbs

¹⁸ Acronym for ‘Pedal Electric Cycle’. Pedelecs are bicycles with an assisting electric engine. The engine

supports the driver while pedaling up to a speed of 25 km/h (according to German Road Traffic Licensing

Act (StVZO)).

and also inclines. Although the data was already acquired by governmental authorities or commer-

cial map providers, it is very costly. Therefore, most of the approaches rely on volunteered

geographic information, for example OpenStreetMap. In OpenStreetMap, that information is

theoretically freely available, but unfortunately hardly present in the dataset. As stated in section

2.3.3, incline values are only available for 0.2 % of the street segments. Approximately 75 % of

them contain only the values ‘up’ and ‘down’. This only indicates if a street or path is inclined, but

does not specify the value. Due to the different requirements of the people, this is not sufficient and

a more accurate knowledge of incline is required.

Besides incline information, there is also a lack of other accessibility-related information in VGI.

Therefore, many routing services were developed which allow users to collect this information

(Kurihara et al. 2004; Menkens et al. 2011; Völkel & Weber 2008; Harriehausen-Mühlbauer 2014).

The idea is that the users gather information about barriers or obstacles on

sidewalks, while getting navigated. This incrementally improves the quality of the route calculation

and also ensures that temporary barriers are acquired. The incline of streets needs to be measured

and can consequently not be determined with those systems. This shows the demand for an

alternative way to determine incline values of a street network.

3.1.2 Energy-efficient routing

Electric powered vehicles, such as E-cars, Pedelecs or e-wheelchairs are getting more and more

popular, although according to Bachofer (2011) people are still skeptical. This can be explained

with high costs, long time for charging the battery or shorter distance range. In addition, a reason

may also be the poor prediction of distance range. Depending on the properties of the street, the

power consumption may vary. The surface material as well as the incline of the street decrease the

battery service life. Depending on the speed, the energy demand increases by 50% to 100% on an

incline of 4% (Bachofer 2011). Although the travel distance is longer, it might be of benefit if the

user takes a route around a hill or avoids streets with bad surface. Consequently, the knowledge of

the incline is an important factor to estimate the distance range per battery life. With a routing

service that considers the energy consumption of a street segment, the energy demand of a route

can be determined. This makes it possible to choose the most energy-efficient route or at

least gives a prediction of the battery’s distance range. This research field is known as EcoRouting

or Green Navigation (Bachofer 2011). Although it is less relevant for fuel-powered vehicles,

since they have a bigger distance range and can use a denser network of gas stations,

EcoRouting can also reduce their fuel consumption and thereby their carbon dioxide emissions.
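
The idea of preferring a longer but flatter route can be sketched with Dijkstra's algorithm on an energy-weighted graph. The cost model below is purely hypothetical and is not taken from Bachofer (2011) or any of the cited systems: `edge_energy`, `flat_cost_per_m` and `incline_penalty` are names and values assumed for this sketch. The penalty of 0.25 per percent is chosen so that a 4 % uphill incline doubles the demand, the upper end of the range quoted above, and downhill edges are simply treated as flat.

```python
import heapq
from typing import Dict, List, Tuple

# (target node, length in metres, incline in percent along the edge direction)
Edge = Tuple[str, float, float]


def edge_energy(length_m: float, incline_percent: float,
                flat_cost_per_m: float = 1.0,
                incline_penalty: float = 0.25) -> float:
    """Hypothetical energy cost of an edge in arbitrary units."""
    uphill = max(incline_percent, 0.0)  # assumption: no recuperation downhill
    return length_m * flat_cost_per_m * (1.0 + incline_penalty * uphill)


def cheapest_route(graph: Dict[str, List[Edge]], start: str,
                   goal: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm with energy instead of distance as edge weight."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for target, length_m, incline in graph.get(node, []):
            if target not in settled:
                new_cost = cost + edge_energy(length_m, incline)
                heapq.heappush(queue, (new_cost, target, path + [target]))
    return float("inf"), []


# Toy network: a 1000 m route over a hill (H) and a flat 1400 m detour (D)
example_graph: Dict[str, List[Edge]] = {
    "A": [("H", 500.0, 4.0), ("D", 700.0, 0.0)],
    "H": [("B", 500.0, -4.0)],
    "D": [("B", 700.0, 0.0)],
}
```

In this toy network `cheapest_route(example_graph, "A", "B")` selects the detour via `D`, although it is 400 m longer, because the uphill edge is charged twice its length. Clamping downhill inclines to zero keeps all weights non-negative, which Dijkstra's algorithm requires.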

Franke et al. (2012) developed an algorithm for energy-efficient routing of electrically powered

vehicles. The resulting navigation system is called eNav. Firstly, it calculates the power consump-

tion for each edge of the routing network, using the length of the edge and incline information.

Secondly, edges which cannot be passed by wheelchairs (e.g. because of steps) are rejected, and

accessibility information about edges and Points of Interest (POIs) is requested from other

platforms like rollstuhlrouting.de¹⁹ or Wheelmap²⁰. In the third step,

surface information is included in the routing algorithm. OpenStreetMap has been taken as data

source for the street network and for information about the street surface. For the calculation of the

incline the authors used airborne laser scanning data, which is of high accuracy but also very

expensive.

Sachenbacher et al. (2011) and Kono et al. (2008) also investigated the topic of energy-efficient

routing. The motivation of Sachenbacher et al. (2011) can be found in the field of electric mobility

and they developed an algorithm for energy-efficient routing using OpenStreetMap street data and

the SRTM DEM with a horizontal resolution of 90 m. Contrary to the aforementioned investiga-

tions, Kono et al. (2008) tried to minimize the fuel consumption of conventional cars, by develop-

ing an eco-friendly routing algorithm. To do so, they consider traffic information, geographic

information and even vehicle parameters. As elevation data they use a DEM with a horizontal

resolution, provided by the Geospatial Information Authority of Japan (GSI).

A freely available alternative to GPS traces for the derivation of incline value is SRTM. Bachofer

(2011) analyzed the influence of DEM accuracy on energy-related routing. He integrated

different DEMs into a routing system and found that the accuracy of the DEM influences

the modelled energy demand only to a minor degree. Consequently, he concluded that SRTM data

is sufficient for this use-case; however, he also discovered that for some routes the

modelled energy demand was off by more than 30%.

3.2 Extraction of Street Attributes from user-generated Movement Trajectories

Mining street information out of user-generated GPS traces has already been investigated by

several researchers. However, the focus was on deriving 2D information only, and to the best of my

knowledge no literature was found, where the elevation of user-generated GPS traces was used to

derive 3D information. As shown in the following section 3.3, high-accuracy GPS measurement

techniques have been used to derive elevation related information.

Van Winden (2014) proposed algorithms to automatically derive different road attributes, like the

direction of the road (one or two way), speed limit or number of lanes. As GPS input data he used

GPS traces acquired from 800 people during a certain time span. Therefore, the transportation

¹⁹ http://rollstuhlrouting.de, checked on 15/07/2015

²⁰ http://wheelmap.org, checked on 15/07/2015

mode was known. The input data of the street network to be updated was taken from Open-

StreetMap. Like in typical data mining processes (cf. section 2.4), the data needed to be prepro-

cessed. The GPS traces had to be semantically linked to a street (map matching) on which the trace

was recorded. For this step, the algorithm by Marchal et al. (2005) was used via an

application programming interface (API).

Map matching was also an essential step in the research of Zhang et al. (2010). They used

GPS traces collected by the contributors of the OpenStreetMap project and aimed to derive street

attributes like the number of lanes and turning-restrictions. Furthermore, they used the traces to

automatically correct the street centerline from the street network when this is geometrically

incorrect. For this purpose, a map matching algorithm was implemented which is described in

detail in section 3.4.2. Additionally, they did an analysis of the coverage of the GPS traces, which

gives a first idea of what can be expected from this research. In their test area they discovered that

highways have 30 to 80 GPS traces whereas city roads have fewer than 20. Secondary roads in a

neighborhood have only a few or even no GPS traces. With a high redundancy, better results can

be achieved.

3.3 Derivation of 3D information, using high-accuracy GPS measurements

In this section, research is presented which involves the derivation of 3D information from GPS,

collected using high-accuracy GPS measurements. Due to the relatively high accuracy, the redundancy

is not as crucial as in the work presented in section 3.2. To achieve a higher positioning

accuracy different methods have been used.

Boucher (2013) used SBAS-GPS receivers to estimate the height of a street network. SBAS²¹ is a

geostationary satellite augmentation system to support GPS. It sends correction data and thereby improves

the accuracy of GPS from 10 m to 2 m. The system which covers Europe is called

‘European Geostationary Navigation Overlay Service’ (EGNOS)²². The collected GPS traces were

fused with OSM street network data and the SRTM-3 DEM. The proposed method relies on GPS

measurements, acquired under good conditions. The roof of a car was equipped with two SBAS-

GPS antennas and the car was driven only on roads in open environments. Therefore, error sources

which are common in crowdsourced GPS-trajectories like obstruction through buildings or

multipath effects are mainly eliminated in this data. To resolve the remaining error, the SRTM-3

DEM is used to correct the discrete height of the GPS-measurements. Matching the GPS traces and

the road network was done using a statistical method, which makes use of the Mahalanobis

distance. The 3D road network was derived by fusing the three data sources sequentially using

²¹ http://en.wikipedia.org/wiki/GNSS_augmentation, checked on 15/07/2015

²² http://www.essp-sas.eu/introducing_egnos, checked on 15/07/2015

Kalman filter techniques. According to an experimental validation the road elevation estimation

could be improved using GPS trajectories in addition to the SRTM-3 DEM.
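
The principle of combining two height sources with a Kalman filter can be illustrated by a single scalar measurement update. This is not Boucher's (2013) filter, which fuses three data sources sequentially along the road; it is only a minimal sketch of the update step, and the function name `kalman_update` as well as all variances in the usage note are assumptions.

```python
from typing import Tuple


def kalman_update(estimate: float, variance: float,
                  measurement: float,
                  measurement_variance: float) -> Tuple[float, float]:
    """Scalar Kalman measurement update.

    estimate / variance: prior height and its variance (e.g. from a DEM)
    measurement / measurement_variance: new height observation (e.g. GPS)
    """
    # The gain weights the new measurement by the relative uncertainties
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance
```

For a prior height of 100 m and a measured height of 104 m with equal (assumed) variances of 16 m², the update returns 102 m with a variance of 8 m²: the fused estimate lies between both sources and is more certain than either of them.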

The following work achieved even higher accuracies by using differential GPS with temporary base

stations (cf. section 2.1.1). Han & Rizos (1999) were motivated by the World Solar Challenge, a

special race for solar-powered cars across the Australian continent. The objective was to determine

the height profile of the road in order to optimize the race strategy. A car equipped with a differen-

tial GPS device drove from Darwin to Adelaide, convoyed by two cars acting as reference stations.

The road was divided into sections and for each section the reference stations were parked at the

beginning and the end. To derive the height information a spatial Kalman-filtering technique was

used to predict the incline information.

3.4 Map Matching

In the field of navigation it is important to know on which street the carrier of a GPS device is

traveling. Furthermore, the position on that street segment is of importance. To solve this problem,

the recorded trajectory data of the moving object and the segments of the street network data need

to be semantically linked (cf. Figure 8). In the literature, algorithms for this task are referred to as Map

Matching (e.g. Quddus et al. 2007; Marchal et al. 2005). One may think that this is a straightforward

task, but due to inaccuracies of both input data sources it is more complicated. Especially in

regions where the GPS signal is generally of low quality (e.g. urban areas, forest) and a dense street

network exists, the quality of map matching may vary. Furthermore, errors and inaccuracies in the

street network data may cause a wrong match of the trajectory data and the street network.

Figure 8: Map Matching. The GPS trace (blue) is snapped to the street network (red) (Map: OSM)

3.4.1 Categorization of Map Matching Algorithms

Quddus et al. (2007) reviewed different map matching algorithms. They categorized the algorithms

into four different groups of approaches. The first group comprises geometric approaches. They

exclusively rely on the geometry of the trajectory and street network data and do not consider the

topology. This means that the connectivity of street segments is not used in the matching process.

Geometric algorithms take the geometry of either the single point positions or the trajectory as

curve and search the closest node within the street network or the closest curve. Consequently, the

approaches are called point-to-point, point-to-curve or curve-to-curve matching. The geometric

approaches are generally faster in the processing and easy to implement. Secondly, there is the

group of topological approaches. In addition to the geometry of trajectory and street network data

they make use of the relationship between the segments of the street network. Two street segments

may for example be connected or disjoint. For the presented algorithms, the topology of the street

network was analyzed in advance. The third group is the group of the probabilistic map-matching

approaches. For those approaches, the error of the GPS measurement is taken into account in the

form of an error ellipse. The error ellipse is used to search for intersecting street segments, which

are considered as matching candidates. In case there is more than one candidate, properties like

speed or direction of the trajectory are used to detect the correct street segment. Advanced map

matching algorithms represent the fourth group. These algorithms use more

advanced techniques, such as Kalman filtering or other mathematical models. For all algorithms the

assumption is made that the GPS trace was recorded while travelling along a street, rather than through

areas where no street can be found.
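
The simplest of these groups, geometric point-to-curve matching, can be sketched as follows. The sketch assumes that all coordinates are already projected into a planar metric system (for example UTM) and that each street segment is a straight line between two points; the function names and the segment dictionary are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance between point p and the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Relative position of the perpendicular foot, clamped to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_point_to_curve(p: Point,
                         segments: Dict[str, Tuple[Point, Point]]) -> str:
    """Return the id of the segment closest to p (purely geometric)."""
    return min(segments,
               key=lambda sid: point_segment_distance(p, *segments[sid]))
```

Because each point is matched independently and connectivity is ignored, consecutive points can jump between parallel streets, which is the weakness the topological and probabilistic approaches address.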

In the following section, three Map Matching algorithms are described briefly. The first two

(Marchal et al. (2005) and Karussel (2014)) are already implemented and ready to use. Both

algorithms are designed for post-processing applications only and may potentially be used in this

research. The third algorithm, proposed by Zhang et al. (2010), is not yet implemented; however, it

is easy to implement.

3.4.2 Functionality of Selected Algorithms

The algorithm proposed by Marchal et al. (2005) is a topological algorithm, as it uses information

about the connectivity of the street segments. As already mentioned, this map matching algorithm

is implemented in the online service called Trackmatching²³. The service provides an API, which

can be queried with a set of GPS points as input. As a response, the user gets a set of IDs,

referencing the traveled street segments from OpenStreetMap. A disadvantage of this algorithm is

that it is limited to the street network and GPS traces cannot be matched to sidewalks or bicycle

lanes. This makes it unsuitable for applications, where the transportation mode can also be cycling

²³ https://mapmatching.3scale.net/, checked on 15/07/2015

or walking. In short, the algorithm works as follows: The incoming GPS points are processed

sequentially and matched according to their distance to a street segment, which is connected with

the previous matched segment. The first step is the initialization process. Starting with the first

GPS point, the three closest street segments are searched by calculating the Euclidean distance.

Using these segments, new candidate paths are created. Each candidate path has a score (weight),

which is the sum of the distances between the GPS points and the segment. The path with the

smallest score and the smallest cumulative distance is considered as the traveled route. If the end of

a street segment is reached, the algorithm searches for street segments, which touch the end node.

All touching segments are now considered as matching candidates. For all candidates the cumulative

distance to the GPS points is calculated. Again, the segment with the lowest score is considered

as the traveled path.
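
The effect of the connectivity constraint can be shown with a strongly simplified, greedy sketch. It is not the algorithm of Marchal et al. (2005): instead of maintaining several scored candidate paths, it keeps a single current segment and only allows a switch to segments that share an end node with it. Planar coordinates are assumed, and all names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def distance_to_segment(p: Point, seg: Segment) -> float:
    """Shortest planar distance between point p and a straight segment."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - ax, p[1] - ay)
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))


def touching(seg_id: str, segments: Dict[str, Segment]) -> List[str]:
    """Ids of all segments sharing an end node with seg_id (itself included)."""
    ends = set(segments[seg_id])
    return [sid for sid, seg in segments.items() if ends & set(seg)]


def greedy_topological_match(points: List[Point],
                             segments: Dict[str, Segment]) -> List[str]:
    """Match points sequentially, restricted to connected segments."""
    # Initialization: the first point is matched purely geometrically
    current = min(segments,
                  key=lambda sid: distance_to_segment(points[0], segments[sid]))
    matched = [current]
    for p in points[1:]:
        candidates = touching(current, segments)
        current = min(candidates,
                      key=lambda sid: distance_to_segment(p, segments[sid]))
        matched.append(current)
    return matched
```

If a trace follows two connected segments and passes close to a third, unconnected street, the unconnected street is never selected, even where it is geometrically the nearest. A greedy choice cannot recover from an early mismatch, which is one reason the original algorithm keeps several candidate paths with cumulative scores.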

Another approach is the algorithm by Karussel (2014). According to the categorization by Quddus

et al. (2007) it may be categorized in the group of advanced algorithms, as it uses a routing engine

to estimate a path, rather than using the street network as input. The algorithm is implemented in

Java and is published²⁴ under the Apache License 2.0 and can therefore be used freely. The

algorithm uses the routing engine Graphhopper²⁵, a route planner which is based on OSM.

Firstly, for each GPS point the three closest street segments are searched and weighted. The weight

of an edge is the shortest distance to the GPS point. Once each GPS point has three weighted street

segments, the routing engine is requested to find the best path along all the selected street seg-

ments. The best path is the one, where the sum of all weights is the smallest. This makes this

algorithm unsuitable for real-time applications. The advantage of this approach is that a realistic

path is found, even if the GPS trace is interrupted (e.g. in tunnels or in dense forests). However, to

calculate a path, Graphhopper needs to know whether the track was recorded while walking,

cycling or driving a car. This is a disadvantage for applications where the transportation mode is

not known. Another disadvantage is that the results are only routes, computed by the route planner.

This probably leads to mismatches when a street was taken although it is not allowed (e.g. a

pedestrian walking in opposite direction on a one way street).

The third algorithm is proposed by Zhang et al. (2010). It is mainly a geometric algorithm, but it also

uses a clustering method; therefore, it may also be categorized as an advanced method. Like the

aforementioned two algorithms, the OSM street network is used as input data. Within the algo-

rithm, three conditions are checked using the street segment and the GPS traces: distance, direction

and angle. Note that this method uses the GPS traces as curves, rather than processing the GPS

points sequentially. First of all, profile lines perpendicular to the street are created with a specific

distance to each other. The length of the lines is 30 m, which has been found to be a reasonable

²⁴ https://github.com/graphhopper/map-matching, checked on 15/07/2015

²⁵ https://graphhopper.com/, checked on 15/07/2015

value considering the error of the GPS traces and the width of the street. All traces which intersect

the perpendicular lines are firstly seen as matching candidates. In case of a one-way road, a

candidate is removed from the list if the GPS trace runs in the opposite direction to the one-way

road. By convention, the direction of one way roads in OSM is specified through the digitization

direction. The third condition is the angle between the trace and the road. If this angle is greater

than 20 degrees, the GPS trace will also be removed from the candidate list. The three conditions

select the corresponding traces, but they can still yield mismatches (false positives) if two two-

way roads are parallel and too close to each other. This is often the case where there are street-

accompanying bicycle lanes and sidewalks. In this case, the GPS traces are matched using a

clustering method.
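
The three conditions can be expressed as a simple predicate. The sketch below is an interpretation of the description above, not the implementation of Zhang et al. (2010). It assumes that the 30 m profile line is centred on the street (so that a trace may cross at most 15 m from the centerline), that headings are given in degrees, and that the crossing offset has already been computed; all names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def heading(a: Point, b: Point) -> float:
    """Direction of the vector a->b in degrees, counter-clockwise from the x axis."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360.0


def angle_between(h1: float, h2: float) -> float:
    """Smallest angle between two directed headings, in the range [0, 180]."""
    diff = abs(h1 - h2) % 360.0
    return min(diff, 360.0 - diff)


def is_candidate(trace_heading: float, road_heading: float,
                 offset_m: float, one_way: bool,
                 profile_length_m: float = 30.0,
                 max_angle_deg: float = 20.0) -> bool:
    """Apply the distance, direction and angle conditions to one crossing."""
    # Distance: the trace must intersect the perpendicular profile line
    if abs(offset_m) > profile_length_m / 2.0:
        return False
    directed = angle_between(trace_heading, road_heading)
    # Direction: traces against the digitization direction of a one-way road
    if one_way and directed > 90.0:
        return False
    # Angle: compare the undirected lines, so both travel directions pass
    undirected = min(directed, 180.0 - directed)
    return undirected <= max_angle_deg
```

A trace running at 185° along a road digitized at 0° is accepted on a two-way road, since only the undirected angle of 5° counts, but rejected as soon as the road is tagged as one-way.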

3.5 Smoothing of Time Series Measurements

The input data for this research are GPS traces collected by users with low-cost GPS devices. As

stated in section 2.1, a certain noise is expected in the data. According to Haining (2003), smooth-

ing algorithms can be used to remove the noise and improve the accuracy of the derived infor-

mation. In his book, he reviews different smoothing methods and techniques. Non-linear smoothing

methods such as median smoothing and linear smoothers like mean smoothers are mentioned.

Linear smoothers shall be chosen if there are no abrupt changes expected in the data. This is due to

the fact that peaks or small-scale features will be removed by averaging values instead of taking the

median. The elevation profile of a street on which the GPS traces have been recorded usually

follows a continuous line. Consequently, the elevation profile of the GPS measurements should

theoretically not contain any discontinuous measurements and if so, they can be considered as

outliers and shall be flattened.

For both linear and non-linear smoothers, a window of certain size is fixed with its center on each data

point of the series. The data point on which the window is fixed will be assigned with a smoothed

value, considering all neighboring data points within the specified window. Out of the data points,

a weighted average or the median can be used as the new value. When determining the weighted

average, the weights can be selected with regard to the distance between the data points, however,

they are normalized to one (Haining 2003, p. 231). Points further away consequently influence the

smoothed value less than closer points. The size of the window and the number of data points,

which influence the result, should be chosen depending on the desired degree of smoothing. A

bigger window size uses more information from points which are further away. This leads to

a higher precision, although it can also lead to biases. A smaller window size decreases the risk of

introducing a bias, but the precision is lower, because a smaller sample and consequently less

information are used. (cf. Haining 2003, p. 229)
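
A weighted moving average of this kind can be sketched as follows. The triangular weights, which decrease linearly with the distance from the window centre, and the truncation of the window at both ends of the series are assumptions of this sketch and are not prescribed by Haining (2003). Dividing by the sum of the weights normalizes them to one.

```python
from typing import List, Sequence


def weighted_moving_average(values: Sequence[float],
                            half_window: int = 2) -> List[float]:
    """Smooth an equidistant series with triangular, normalized weights."""
    n = len(values)
    smoothed: List[float] = []
    for i in range(n):
        # The window is centred on i and truncated at the series boundaries
        lo = max(0, i - half_window)
        hi = min(n - 1, i + half_window)
        indices = range(lo, hi + 1)
        # Closer neighbours get larger weights than distant ones
        weights = [half_window + 1 - abs(j - i) for j in indices]
        total = sum(weights)
        smoothed.append(sum(w * values[j] for w, j in zip(weights, indices)) / total)
    return smoothed
```

For an elevation series `[100, 100, 130, 100, 100]` and `half_window=1`, the outlier of 130 m is reduced to 115 m, while its neighbours are raised; a larger `half_window` flattens the peak further at the cost of smearing real changes in elevation.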

In the following, Tukey's (1977) ‘Median of 3’ algorithm is explained in more detail. Although it is

a non-linear smoothing algorithm and thus not likely to be used in this research, the functionality is

very similar to a weighted moving average smoother.

Tukey (1977, pp. 210-213) presents an algorithm for smoothing equidistant sequences of numbers, namely ‘Medians of 3’. The smoothing starts with the second number, which is selected together with the previous and the following number. It has to start with the second number, since the first does not have a left-hand neighbor; likewise, the last number does not have a right-hand neighbor. This yields no value for the first and the last position of the sequence. The three selected numbers are ordered, so that their median can be determined easily. The median is then assigned to the middle position as its new (smoothed) value. This process is repeated for all values of the sequence. It is important that the neighbor on the left-hand side is taken from the raw data, rather than from the already smoothed data. The smoothing can then be applied to the smoothed sequence again, to achieve a higher degree of smoothness. Figure 9 shows a raw sequence of numbers (row 1) as well as the sequence after a single (row 2) and a repeated (row 3) smoothing.

Figure 9: Example of ‘Median of 3’ smoothing with the raw data (row 1) and the results of the single (row 2) and the repeated (row 3) median smoothing. (Tukey 1977, p. 212)

This approach is problematic, as the sequence gets shorter with every pass because of the missing values at both ends. To solve this problem, Tukey (1977) proposed two approaches. The first and simplest one is to copy the end value from the raw data to the smoothed values. The second and more complicated one is the so-called ‘end-value smoothing’. The missing end value is the median of the following three values, which for the first position are derived only from the first three values of the sequence. The corresponding numbers from the example in Figure 9 are shown in parentheses.

Value 1: the actual raw value (13)

Value 2: the second value of the first smooth (9)

Value 3: the second value of the first smooth (9) plus two times the difference between the second and the third value of the first smooth (9 and 7): 9 + 2 * (9 - 7) = 13

Consequently, the first value of the smoothed sequence is the median of 13, 9 and 13, which is 13. An illustration for better understanding is shown in Tukey (1977, p. 222).

Data rows of Figure 9:

Row 1 (raw data):       13 7 9 3 4 11 12 1304 10 15 12 13 17 20 24
Row 2 (smoothed once):  -  9 7 4 4 11 12 12 15 12 13 13 17 20 -
Row 3 (smoothed twice): -  - 7 4 4 11 12 12 12 13 13 13 17 -  -
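The rows of Figure 9 and the end-value example can be reproduced with a short script. This is a sketch in Python for illustration only; the function names `medians_of_3` and `end_value` are hypothetical, and `None` marks the positions that a pass leaves undefined.

```python
from statistics import median


def medians_of_3(values):
    """One pass of running medians of 3; both end positions stay undefined."""
    inner = [median(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    return [None] + inner + [None]


def end_value(raw_end, smooth_next, smooth_after):
    """End-value smoothing: median of the raw end value, the adjacent
    smoothed value and a linear extrapolation from the two smoothed values."""
    extrapolated = smooth_next + 2 * (smooth_next - smooth_after)
    return median([raw_end, smooth_next, extrapolated])


raw = [13, 7, 9, 3, 4, 11, 12, 1304, 10, 15, 12, 13, 17, 20, 24]

# Row 2: a single pass over the raw data.
once = medians_of_3(raw)

# Row 3: a second pass over the defined part of row 2.
twice = [None] + medians_of_3(once[1:-1]) + [None]

# End-value smoothing for the first position: median(13, 9, 13).
first = end_value(raw[0], once[1], once[2])
```

The outlier 1304 in the raw data disappears after the first pass, which shows why median smoothing is robust against single extreme values.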


4 Methodology

The methodology describes all steps which are necessary to derive incline values for the street network. First, the pilot region in which the approach is tested is defined. Second, the tools and data used are described in detail. This is followed by a detailed description of the workflow and implementation, including data import, preprocessing, map matching and the calculation of the incline. The last section of this chapter describes how the derived information can be validated.

4.1 Definition of Pilot Region

The pilot region for this research is the region around Heidelberg in the south-west of Germany. In Figure 10, the extent of the region is indicated by the red bounding box. The region was chosen since it is also one of the pilot regions of the project CAP4Access. Projected, the area is almost a square with a side length of approximately 22 km. This results in an area of approximately 497 km². The area is bounded by the localities Leutershausen in the north, Neckarsteinach in the east and Nussloch in the south; in the west it reaches almost to Mannheim. The region is characterized by mountainous and forested areas in the east as well as flat urban areas and farmland in the west. This is particularly suited for this research, since it makes it possible to differentiate the results between land use classes and other characteristics.

Figure 10: Pilot region Heidelberg / Germany. (Map: OSM)


4.2 Tools

In order to work on this topic and implement the steps of the workflow, a range of tools was used. Java was used as the programming language to implement the main part of the software. The extensive number of spatial algorithms implemented within the Java libraries GeoTools²⁶ and Java Topology Suite (JTS)²⁷ made it convenient and efficient to work with geometries. Both libraries are licensed under the open-source license LGPL and can easily be integrated into the project using Maven dependency management²⁸. Maven is a tool which helps to build Java projects and manages the integrated libraries. As an integrated development environment (IDE), Eclipse has been used. An IDE supports programmers with a source code editor, syntax highlighting, recommendations and a useful debugging tool.

To store the input data as well as the results, a PostgreSQL (PGSQL) database has been used. In

order to work with spatial data the extension PostGIS has been added to the installation of PGSQL.

It also provides many functions which implement geometric algorithms, however, the processing

was mainly done in Java.

For the visualization of intermediate results and some preprocessing tasks, the GIS tools ArcGIS and one of its open-source alternatives, QGIS, have been used. QGIS was used in addition to ArcGIS, since connecting to the database and loading the data is very easy and intuitive.

4.3 Data

The two main data sources for this research are the crowdsourced GPS traces and the street network. Both data sources are described in detail in the following sections. Furthermore, additional data such as land use classes and digital elevation models (DEMs) are described, which are not needed for the calculation of the incline, but are used for the evaluation of the result.

4.3.1 Crowdsourced GPS traces

Crowdsourced, or user-generated GPS traces, are one of the data sources for the determination of

street incline. Firstly, this section gives an overview of different platforms, projects or applications

in which GPS traces are crowdsourced by volunteers. Secondly, the format for exchanging

GPS traces, GPX, is introduced. After that, it is described how GPS traces are handled within the

OpenStreetMap project, since the GPS traces of OSM will be used for this research. At the end,

typical errors within the data are discussed briefly.

26 http://www.geotools.org, checked on 15/07/2015

27 http://www.vividsolutions.com/jts/JTSHome.htm/, checked on 22/05/2015

28 http://maven.apache.org/, checked on 22/05/2015


4.3.1.1 Platforms and Devices

There are several platforms and applications in which GPS traces are collected for different purposes. Examples include sport-tracking apps for smartphones, such as Strava²⁹, Runtastic³⁰ or Runkeeper³¹, which track the user’s way while training in order to provide statistics about the activity, such as distance, average speed, total climb or the elevation profile. Other examples are platforms such as gpsies.com³², whose purpose is to exchange and recommend traveled routes for outdoor activities. The collection of GPS traces within the OpenStreetMap project has the purpose of supporting the map making.

The devices which are usually used to record GPS traces, such as smartphones or handheld GPS devices, have integrated low-cost GPS receivers (Heipke 2010). Depending on the device or smartphone app used, the elevation information of the track points may originate from a different source than GPS. Some devices, especially handheld GPS devices, have built-in barometers, which determine the elevation by measuring the change of air pressure. This can lead to high systematic errors if the barometer is not calibrated properly. Another source of the elevation values in crowdsourced GPS traces are elevation databases. The sport-tracking services Runkeeper and Strava replace the elevation measured by the GPS receiver with values from an elevation database, since, due to the poor vertical accuracy of GPS (cf. section 2.1.2), a plotted elevation graph or the calculated total climb may otherwise be wrong. While Strava uses an elevation database without mentioning the source of the elevation information, Runkeeper uses the third-party service topocoding.com, which is based on elevation information from SRTM³³,³⁴. However, neither service specifies whether the measured elevation is only replaced for calculation and visualization, or whether the exported GPX files also contain the elevation values from the database rather than the original measurements. GPS traces uploaded to the OpenStreetMap project might have been recorded with the aforementioned apps. This results in the problem that the GPS traces may contain elevation information which was not actually measured by GPS, but taken from other sources. Depending on the device or smartphone application, the elevation does not necessarily refer to the WGS 84 ellipsoid (cf. 2.1), but may also refer to the mean sea level.

29 https://www.strava.com/, checked on 23/06/2015

30 https://www.runtastic.com/, checked on 23/06/2015

31 http://runkeeper.com/, checked on 23/06/2015

32 http://gpsies.com/, checked on 23/06/2015

33 https://strava.zendesk.com/entries/20965883-Elevation-for-Your-Activity, checked on 22/05/2015

34 https://support.runkeeper.com/hc/en-us/articles/201109736-How-does-RunKeeper-calculate-elevation-and-climb-, checked on 22/05/2015


4.3.1.2 The GPX Format

For the exchange of GPS traces between the device and one of the aforementioned platforms, the GPX format is commonly used. GPX is the abbreviation of ‘GPS Exchange Format’ and is an XML-based format. Figure 11 shows an example GPX file, which is an instance of the GPX schema of version 1.1³⁵. The root element ‘gpx’ contains information about the version and the schema location as attributes. As child elements there are waypoints (‘wpt’) and tracks (‘trk’). Waypoints are points which have been stored separately in order to mark locations, such as points of interest. The track element contains the actual GPS trajectory. Within this work it is referred to as GPS trace or simply trace. A trace may contain several track segments (‘trkseg’), which in turn contain track points (‘trkpt’). The latter must at least be described by the attributes longitude and latitude. The coordinates are given in the geographic reference system WGS84. Additional optional information, such as timestamp or elevation, can be stored as child elements (cf. Ramm & Topf 2010, p. 26).

Figure 11: Example GPX file.

35 http://www.topografix.com/gpx/1/1/gpx.xsd, checked on 23/06/2015

<?xml version="1.0" encoding="UTF-8"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     version="1.1"
     xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
  <wpt lat="49.4056396484375" lon="8.684947967529297">
    <name>Castle</name>
  </wpt>
  <wpt lat="49.410247802734375" lon="8.692361831665039" />
  <trk>
    <name>example gps track</name>
    <trkseg>
      <trkpt lat="49.411144" lon="8.705768">
        <ele>194.31606</ele>
        <time>2015-05-29T16:18:32Z</time>
      </trkpt>
      <trkpt lat="49.41126" lon="8.70587">
        <ele>250.8594</ele>
        <time>2015-05-29T16:18:34Z</time>
      </trkpt>
      ...
    </trkseg>
  </trk>
</gpx>
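To illustrate the structure of the format, the following sketch reads the track points of a GPX 1.1 document with Python's standard library. It is an illustration only and not the Java import used in this thesis; the function name `read_track_points` and the shortened sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of GPX version 1.1, as declared in the root element.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_track_points(gpx_text):
    """Return (lat, lon, ele, time) for every track point of a GPX 1.1 document.

    Elevation and time are optional child elements and may be None.
    """
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.findall("gpx:trk/gpx:trkseg/gpx:trkpt", GPX_NS):
        ele = trkpt.findtext("gpx:ele", default=None, namespaces=GPX_NS)
        time = trkpt.findtext("gpx:time", default=None, namespaces=GPX_NS)
        points.append((
            float(trkpt.get("lat")),
            float(trkpt.get("lon")),
            float(ele) if ele is not None else None,
            time,
        ))
    return points


# Shortened, hypothetical sample; the second point has no elevation.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <trk><trkseg>
    <trkpt lat="49.411144" lon="8.705768">
      <ele>194.31606</ele>
      <time>2015-05-29T16:18:32Z</time>
    </trkpt>
    <trkpt lat="49.41126" lon="8.70587"/>
  </trkseg></trk>
</gpx>"""

track_points = read_track_points(SAMPLE)
```

Because the elevation is optional, a consumer of such data has to handle track points without an `ele` element explicitly, as done here with `None`.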


4.3.1.3 OpenStreetMap GPS traces

The main input data for this research are the GPS traces collected by OSM contributors. The so-called gpx-planet file³⁶ contains all original GPS traces as GPX files from all over the world. The latest version of this file dates from April 2013; the script to create the dump is, however, available online³⁷. It needs to be applied to the OpenStreetMap database, to which only OSM administrators have access. Therefore, a new dump cannot be created and the one from April 2013 has to be used.

The data was collected by thousands of users and contains more than 2.5 billion track points. As

shown in Figure 12, which depicts the number of GPS track points per grid cell, the majority of

points can be found in Europe. Especially in Germany, Austria and Switzerland a higher density

than in other European countries, like Spain, can be observed. This may be due to the higher

population density or a higher motivation in collaborating in such projects.

Figure 12: Screenshot of a grid map showing the number of GPS points per grid cell.³⁸

There are also regional extracts of the gpx-planet file available³⁹, which make processing of the dataset more convenient if one is only interested in a certain region. Besides the gpx-planet file as data source, there is an Application Programming Interface (API)⁴⁰, which allows the user to access the track points within a given region using HTTP requests. On the website of OSM, there is a

36 http://planet.openstreetmap.org/gps/, checked on 22/05/2015

37 https://github.com/iandees/planet-gpx-dump/, checked on 23/06/2015

38 Screenshot taken from http://resultmaps.neis-one.org/osmgps.html, checked on 22/05/2015

39 http://zverik.osm.rambler.ru/gps/files/extracts/index.html, checked on 22/05/2015

40 http://wiki.openstreetmap.org/wiki/API_v0.6#GPS_traces, checked on 22/05/2015


public list showing the uploaded traces. The upload of traces can be done manually using an online form⁴¹ or via the API using HTTP POST. In order to supply additional information about the trace, a description must be given and a comma-separated list of keywords can be added. This makes it possible to search and find traces by keywords. The uploaded traces must also be assigned a visibility. Some users might not want to be linked to the uploaded traces, since one may draw conclusions about the user’s location and movement profile. Table 3 shows the four options ‘identifiable’, ‘public’, ‘trackable’ and ‘private’ and their explanations.

Visibility: Description

Identifiable
- shown in the public trace list
- points with timestamp served over the API
- contained in gpx-planet file
- linked to trace page via API
- conclusions about the contributing user can be drawn
- access to raw GPX file possible via trace page

Public
- shown in the public trace list
- points with timestamp served over the API
- contained in gpx-planet file
- not linked to trace page via API
- access to raw GPX file only via public trace list

Trackable
- not shown in the public trace list
- points with timestamp served over the API
- contained in gpx-planet file
- not linked to the trace page
- no access to raw GPX file

Private
- not shown in the public trace list
- points without timestamp served over the API
- not contained in gpx-planet file
- not linked to the trace page
- no access to raw GPX file

Table 3: Overview of visibility options for the upload of GPS traces⁴²

The gpx-planet file has been imported, and the public trace list has been queried to access the traces uploaded after August 2013 (which are not contained in the dump). In total, 4194 GPS traces from the gpx-planet file lie within the pilot region. Of these, 86% (3606) have elevation information and can therefore be used for this research. With the additional traces from the public trace list, the number increases to 3842 traces. In total, there are over two million GPS track points in the area of the pilot region (~497 km²). Assuming that the pilot region was a square with a side length of 22,000 m and the points were evenly distributed, there would be one GPS point in every cell of about 15 m x 15 m.
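The spacing estimate follows from dividing the area by the number of points. The arithmetic can be retraced with the following minimal sketch, which assumes exactly two million points (the text only states "over two million") and an ideal square with a side length of 22,000 m.

```python
import math

side_length_m = 22_000        # side length of the idealized square region
number_of_points = 2_000_000  # assumption: lower bound of "over two million"

# Area available per point if the points were evenly distributed.
area_per_point_m2 = side_length_m ** 2 / number_of_points

# Side length of the square cell that each point would occupy.
cell_side_m = math.sqrt(area_per_point_m2)
```

Under these assumptions each point occupies about 242 m², which corresponds to a cell of roughly 15.6 m side length and is consistent with the stated 15 m x 15 m.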

41 http://OpenStreetMap.org/traces, checked on 22/05/2015

42 source: http://wiki.openstreetmap.org/wiki/Visibility_of_GPS_traces, checked on 20/07/2015


4.3.1.4 Typical Errors

As already mentioned in 2.1.2, GPS measurements suffer from multiple errors and inaccuracies.

Therefore, the elevation profile of a GPS trace always contains noise, meaning that neighboring

points on a flat terrain will often have different elevation values. Figure 13 shows the elevation

profile of a GPS trace on a fairly flat street. It can be seen that the measured elevation fluctuates

continuously within a range of ± 2 m. An analysis regarding the elevation accuracy of

crowdsourced GPS data is given in section 5.1.

Figure 13: Elevation profile of a GPS trace, recorded on a flat street.

In addition to the noise, it may also happen that the GPS signal is lost and the receiver loses the position fix. The next point of the trace is then recorded only when the satellite signal is received again. This may for example happen in tunnels or in other situations where no signal can be received. This phenomenon results in traces in which long distances between two adjacent points can be found and which do not represent the course of the street. Figure 14 shows a few examples.

Figure 14: GPS traces with lost GPS-signals in tunnels. (Map: OSM)
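Such gaps can be found by comparing the distance between consecutive track points with a threshold. The following Python sketch is an illustration and not the method used in this thesis; the haversine formula with a spherical earth radius of 6,371 km, the function names and the threshold of 100 m are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_gaps(points, max_gap_m=100.0):
    """Return every index i for which the distance between point i and
    point i + 1 exceeds max_gap_m (hypothetical threshold)."""
    return [
        i for i in range(len(points) - 1)
        if haversine_m(*points[i], *points[i + 1]) > max_gap_m
    ]


# Hypothetical trace (lat, lon): the jump between the second and the
# third point is about 1.3 km, e.g. after a tunnel.
trace = [(49.4100, 8.7000), (49.4101, 8.7001),
         (49.4200, 8.7100), (49.4201, 8.7101)]
gaps = find_gaps(trace)
```

A trace could then be split at the reported indices, so that the straight connection across the gap is not mistaken for the course of the street.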


4.3.2 Street Network

The street network will be enhanced with the calculated incline values. For this research, potentially every collection of street geometries may be used; however, OSM was chosen as the data source. OpenStreetMap data is open data and its data model is commonly known in the domain of volunteered geographic information. The streets are represented as LineStrings or, in terms of OSM, ways. The geometry specifies the centerline of the street rather than its outline. The streets are classified using different tags. For the pilot area, 57,824 street elements with a total length of around 5336 km were extracted. Table 4 shows the values of the key ‘highway’ which were used to extract the streets from the OpenStreetMap dataset. It furthermore shows the share of each street type in the total length in percent. The street network is composed of different types of streets and paths. This includes ways which are dedicated to cars, pedestrians, cyclists or a combination of the aforementioned. For convenience and to avoid confusion, it has to be noted that the individual parts of the street network are referred to as ‘street’ in this thesis, although this also includes paths which cannot be used by car.

value          Description⁴³                                          share in %
track          agricultural, forestry streets                         43.84
residential    streets within residential areas                       18.38
path           mainly hiking trails and small paths                    9.56
footway        for pedestrians only                                    8.13
secondary      country road of second priority                         4.34
tertiary       country road of third priority                          2.84
cycleway       for cyclists only                                       2.62
living_street  streets where pedestrians have priority over cars       2.01
motorway       equivalent to autobahn                                  1.98
unclassified   roads with lower priority than tertiary                 1.91
primary        country road with highest priority                      1.33
others⁴⁴                                                               3.06

Table 4: Values of highway tag and their share of length in percent.

The relatively high share of streets with the tag ‘highway=track’ can be explained by the high proportion of forest and fields in the pilot area. The total length of footways and cycleways is relatively small compared to that of residential roads, although residential streets often have adjacent footways. Instead of mapping a footway as an individual way, the information can also be added as a tag to the street geometry. The same holds true for bicycle lanes. As already mentioned in section

43 http://wiki.openstreetmap.org/wiki/Map_Features#Highway, checked on 27/05/2015

44 trunk, motorway_link, pedestrian, trunk_link, secondary_link, primary_link, road, tertiary_link


2.3.1, it is obvious that streets important for pedestrians or wheelchair users have a large share.

This is another argument for using OpenStreetMap data instead of commercial street network data.

4.3.3 Land Use Information

Information regarding the land use is used in order to classify the results in chapter 5. For this purpose as well, OpenStreetMap data was used to extract polygons with certain land uses. There is an extensive list of tags in the OpenStreetMap Wiki (2015b) which indicate different land use classes, such as ‘residential’, ‘forest’ or fields. Since different land uses usually have different characteristics in terms of visibility of satellites and obstruction by man-made structures, the results are expected to depend on the land use. There are many tags which describe areas with different land uses, but only the seven most common tags have been extracted. Table 5 shows the different land uses and their characteristics. Rural areas like farmland and allotments are characterized by fields and mainly low buildings. This means there are almost no structures which may influence the GPS signal through multipath effects or shadowing. In OpenStreetMap, two tags are used for farmland (‘landuse=farm’ and ‘landuse=farmland’). According to the wiki, both tags mean the same, although farmland should be preferred over farm. Therefore, both tags are combined and referred to as ‘farmland’. Urban areas such as commercial, industrial or residential areas are characterized by taller buildings and urban canyons (Langley 1999), where multipath and shadowing effects are more likely. Due to the quick change of reflected signals and the change from shadowing to open view (at crossroads), a heterogeneous GPS quality can be expected. In forested areas, which are mainly covered by a dense tree canopy, it is very likely that the GPS signal is shadowed. In these areas a homogeneously degraded GPS quality can be expected.


Value of key ‘landuse’: Characteristics

allotments
- Gardening
- Small buildings

commercial
- Office buildings

farm
- Land used for farming
- No buildings, no trees
- Tillage and pasture

farmland
- Same as ‘farm’
- Should be used instead of farm

forest
- High trees
- Dense canopy

industrial
- Factories or warehouses
- Wide streets for trucks and delivery vehicles
- Less obstruction than residential

residential
- Urban environment
- Tall buildings
- Narrow streets, trees

Table 5: OSM landuse-tags and their characteristics.

4.3.4 Digital Elevation Models

Within this research, digital elevation models (DEMs) are used during the evaluation of the results. A DEM is a representation of the earth’s surface and may be acquired with different methods such as terrestrial surveying or remote sensing techniques like stereo photogrammetry, radar systems or LiDAR (Light Detection and Ranging). Only airborne or spaceborne remote sensing techniques make it possible to acquire data for a larger region in a reasonable time. From the measurements, a DEM can be generated using data analysis techniques such as interpolation. ‘Digital elevation model’ is a general term which covers two more specific terms: ‘digital surface model’ (DSM) and ‘digital terrain model’ (DTM). There are different definitions of the terms (Zhilin et al. 2005; Cartwright et al. 2007); in this research, they are used as follows. As shown in Figure 15, DSMs and DTMs can be differentiated by which structures are included in the model. While a DSM is the representation of the earth’s surface including all objects on it, like trees and buildings, those objects are excluded in a DTM. Remote sensing techniques measure the surface (e.g. top of building, top of tree canopy); consequently, the measurements need to be corrected in order to derive a DTM which excludes the objects located on the terrain.


Figure 15: Difference of DSM and DTM.⁴⁵

DEMs can also be classified in terms of horizontal resolution. Czegka et al. (2004) classify DEMs into high-resolution DEMs with a cell size smaller than 10 m, medium-resolution DEMs with a cell size of 30 m to 100 m and low-resolution DEMs with cell sizes greater than 500 m.

For the validation within this research, a high-resolution DEM is used as reference data against which the derived incline values are compared. Furthermore, it is used for the error analysis of the GPS points. The DEM was computed from LiDAR measurements by the ‘Landesamt für Geoinformation und Landentwicklung Baden-Württemberg’ between the years 2000 and 2005. It represents the terrain, excluding buildings and trees, and is consequently a DTM. Areas which could not be measured, like buildings or very dense forests, were interpolated using neighboring points. The available data covers the entire pilot region with a horizontal resolution of 1 m and a vertical accuracy of 0.5 m. The DTM was provided as a text file of points, containing evenly distributed XYZ coordinates. With the help of the ArcGIS function ‘Point to Raster’, the DTM has been converted to a georeferenced raster file. The coordinates were given in the Gauss-Krüger coordinate system. The elevation values are heights above sea level and thus refer to a quasigeoid. Contrary to a geoid, a quasigeoid is not an equipotential surface; however, it deviates from the geoid only in the mm to cm range (cf. Torge 2001, p. 82). Since the GPS points as well as the street network data are given in the World Geodetic System 1984 (WGS84), the horizontal datum of the DTM is transformed to WGS84. This enables the evaluation of the GPS accuracy.

In addition to the high-resolution DTM, the SRTM-1 DEM is used to compare the calculated inclines with data which is, like crowdsourced GPS points, free and globally available. The SRTM DEM was acquired during the space mission SRTM (Shuttle Radar Topography Mission). To be more precise, the SRTM DEM is a DSM, since objects on the earth’s surface are not removed. The

45 Adopted from: http://www.gsi.go.jp/WNEW/TEC-NEWS/2007-tec172.html, checked on 22/05/2015


SRTM mission was a joint project of NASA, the National Geospatial-Intelligence Agency and the German as well as the Italian space agencies. The DEM is available between 60° north latitude and 56° south latitude and thus covers 80% of the land on the earth. The data is available with WGS84 as horizontal reference system and EGM96 as vertical datum. The elevation therefore refers to the geoid, rather than to the ellipsoid. The absolute height error in Eurasia was identified by Farr et al. (2007) as 6.2 m. Another freely available DEM is the ASTER Global DEM. It was compiled from data collected by the “Advanced Spaceborne Thermal Emission and Reflection Radiometer”, mounted on the Terra spacecraft. Like SRTM, it has a horizontal resolution of 30 m and covers the land between 83° north and 83° south latitude. Although the coverage is better than that of SRTM, the vertical accuracy is worse, at only 9.2 m (Meyer 2011). Therefore, SRTM, as the most accurate and almost globally available data source, has been chosen for the evaluation.

4.4 Workflow and Implementation

To reach the final result of street segments enhanced with incline values, several steps need to be performed, as depicted in Figure 16. After the import of both the street network data and the GPX files, the data sets need to be preprocessed in order to prepare them for the following steps. The GPS traces must first be linked to the individual street segments. This step is known as map matching and detects the GPS traces which were recorded on each street segment. After that, the incline of the street segments can be calculated, one segment after another, using the assigned GPS traces. Then, the calculated incline values are compared with the incline values calculated from the LiDAR DTM and the SRTM DSM. In the following sections, all individual steps are described in more detail.

Figure 16: Process of deriving incline information out of user-contributed GPS traces.

Since the tools which are developed for the purpose of this thesis may also be of interest to other users and use cases, they are planned to be published under an open-source license. Therefore, each step is implemented as an individual tool and all intermediate results are saved in the


database. The single steps can then be used individually. The tools are designed to be as generic as possible, so that they can be used with different data sources. The requirements on the data sources in terms of modelling are kept as small as possible. The latest version of each developed tool can be found on the CD attached to this master’s thesis. Additionally, some of the tools are accessible on the GitHub account of the GIScience Research Group of the University of Heidelberg⁴⁶.

4.4.1 Data Import

Before starting the actual process of calculating incline information, the data sources need to be

imported into a PostgreSQL/PostGIS database, running on a local machine. This enables the

storage of the file-based GPS data and street network in a relational way and provides fast and easy

access from the Java program. The following subsections describe the process of importing the

GPS data and the OpenStreetMap dump, from which the street network and land use information

are extracted.

4.4.1.1 GPS traces

As mentioned in section 4.3.1, the gpx-planet file in its latest version from August 2013 is used as the data source and extended with the traces uploaded from August 2013 until now, taken from the public trace list. The import is realized in Java and the source code is published on the GitHub account of the GIScience Research Group⁴⁷. The gpx-planet file is packed and compressed as a *.tar.xz archive and contains, on the one hand, an XML file with metadata and, on the other hand, all GPX files in a directory structure. The metadata file includes information about the traces, like user, user id, number of points, description and tags. The tar.xz archive does not need to be unpacked in advance, since this is done on the fly using a combination of different Java classes, which handle the unpacking and reading automatically. The first file to read and parse is the metadata file. All entries are stored as single objects in a TreeMap, which maps each GPX id to its metadata object. In a TreeMap the ids are kept in sorted order, which allows fast access to an entry by its id. Once the metadata is stored in memory, the GPX files can be read sequentially. After reading a file, it needs to be parsed into Java objects, which is known as unmarshalling⁴⁸. The corresponding object classes need to be generated in advance from the XML schema of GPX version 1.0⁴⁹.

[46] https://github.com/GIScience, checked on 20/07/2015
[47] https://github.com/GIScience/osmgpxfilter, checked on 20/07/2015
[48] http://en.wikipedia.org/wiki/Marshalling_%28computer_science%29, checked on 20/07/2015
[49] http://www.topografix.com/gpx/1/0/gpx.xsd, checked on 20/07/2015

Once the GPX file has been read, it needs to be filtered. The workflow of the filter, which checks two conditions, is depicted in Figure 17. First, it is checked whether the trace contains elevation information. For this step it is sufficient to check a single track point, since either all or none of the track points have elevation information. If the track point does not contain elevation information, the trace is skipped and not imported. Otherwise, it is checked whether the trace falls into the bounding box of the pilot region. This condition is met when at least one track point lies within the region; only if no track point intersects the bounding box is the trace not imported. A trace which meets both conditions is written, together with its metadata, into the PostgreSQL/PostGIS database. As mentioned in section 4.3.1, a GPX file may contain several track elements. Hence, each track of a GPX file is stored as a single 3D LineString, together with the GPX-id, track-id and metadata. The track-id is newly introduced and unique within the tracks of one GPX file. Figure 18 shows the columns and data types of the relation ‘gpx_data_line’. The columns ‘gpx_id’ and ‘trk_id’ form the primary key. After the trace has been written to the database, its ID is added to an ArrayList. This list is used later to determine whether a trace has already been imported.
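The two filter conditions can be illustrated with a minimal Java sketch. It is not taken from the published tool: the class name `TraceFilterSketch`, the array layout of a track point (`{lat, lon, ele}`, with `NaN` standing for a missing elevation) and the array layout of the bounding box are assumptions made purely for illustration.

```java
import java.util.List;

/** Illustrative sketch of the two import conditions (names and data layout are assumptions). */
final class TraceFilterSketch {

    // Assumed layout of a track point: {lat, lon, ele}; ele is Double.NaN if no elevation was recorded.
    // Assumed layout of a bounding box: {top, left, bottom, right} in WGS 84 degrees.

    /** Returns true if the point lies inside (or on the border of) the bounding box. */
    static boolean inBox(double[] p, double[] box) {
        return p[0] <= box[0] && p[0] >= box[2] && p[1] >= box[1] && p[1] <= box[3];
    }

    /** Returns true if the trace passes both conditions and should be imported. */
    static boolean accept(List<double[]> trace, double[] box) {
        // Condition 1: only the first point is inspected, since either all or none
        // of the track points of a trace carry elevation information.
        if (trace.isEmpty() || Double.isNaN(trace.get(0)[2])) {
            return false;
        }
        // Condition 2: at least one track point must fall into the bounding box.
        for (double[] p : trace) {
            if (inBox(p, box)) {
                return true;
            }
        }
        return false;
    }
}
```

A trace with a single point inside the pilot region is accepted, whereas a trace without elevation is rejected regardless of its location.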

Figure 17: Filtering and import of GPS traces.

Figure 18: The schema of the relation 'gpx_data_line' for storing the GPS traces.


Now, all files included in the GPX-planet file are imported. To add the traces which have been uploaded to OSM after August 2013, the public trace list is scraped for the additional traces. Scraping [50] is the automatic reading of information from web sites. The public trace list is split into pages of 20 traces, and each page can be requested by its page number (e.g. https://www.openstreetmap.org/traces/page/1). The HTML code of the trace list contains the links to the individual trace pages, each of which contains the GPX-id and a link to the GPX file. If the GPX-id is not contained in the ArrayList created during the import of the GPX-planet file, the file is downloaded and unmarshalled. In contrast to the GPX-planet file, these files are in the original version uploaded by the user. Therefore, the traces may be stored in different GPX versions (1.0 or 1.1). This is problematic for the unmarshalling process, because the version must be known. Consequently, the first step is to detect the GPX version, so that the file can be unmarshalled using the correct GPX schema. Since all following steps use the object classes generated from the GPX version 1.0 schema, all traces in version 1.1 need to be transformed. After the file has been unmarshalled successfully, it passes through the same checks for elevation information and bounding box. After that, the tracks of the file are written into the database.

As already mentioned, the developed tool is published. In order to make the tool usable for other use cases as well, all conditions which are checked during the import are adaptable. The arguments must be given when starting the program from the command line. In addition to the export to PostgreSQL, the tool supports output as an ESRI shapefile or as a database dump in the same format as the input data. It can also be decided whether only the dump should be used as input or whether the public trace list should be scraped in addition. The following command has been used to run the program. It specifies the path to the input file and defines the desired bounding box. The parameter ‘datasource’ with the value ‘both’ means that the dump is imported and the OSM trace list is requested. The options ‘clip’ and ‘elevation’ ensure that GPS points outside the bounding box are not written and that only points with elevation information are imported into the database. Finally, the database parameters and the geometry format are given.

java -jar osmgpxfilter-0.1.jar \
-bbox top=49.459693 left=8.573179 bottom=49.352565 right=8.794050 \
--clip \
--elevation \
--datasource both \
--input C:/baden-wuerttemberg.tar.xz \
--write-pgsql db=osmgpx user=postgres password=XX host=localhost port=5432 \
geometry=linestring

[50] http://en.wikipedia.org/wiki/Web_scraping, checked on 20/07/2015


4.4.1.2 OSM Street Network and Land Use Information

The street network and the land use information are extracted from the OpenStreetMap planet file. The entire planet file contains all map data of the world and therefore has a size of around 30 GB. Since only the data of the pilot region is needed here, the regional extract for the German state of Baden-Württemberg was downloaded [51]. The import into the database was carried out using the command-line Java program Osmosis [52]. The tool reads the planet file or its regional extracts sequentially and writes the data into relations corresponding to the OpenStreetMap data model. It also provides filter capabilities based on tags and bounding box. Consequently, only ways containing either the tag ‘highway’ or the tag ‘landuse’ within the bounding box of the pilot region were imported. Relations and nodes were rejected to keep the required space in the database and the time needed for the import as small as possible. The command to read, filter and write the dump to the local database looks as follows:

osmosis \
--read-pbf C:\baden-wuerttemberg-latest.osm.pbf \
--bounding-box top=49.5117 left=8.52791 bottom=49.311 right=8.83534 \
--tag-filter accept-ways highway=* \
--tag-filter accept-ways landuse=* \
--tag-filter reject-relations \
--tag-filter reject-nodes \
--write-pgsimp host="localhost" database="osm" user="steffen" password="xx"

Prior to the import, the database schema with the necessary relations must be created. SQL scripts containing the corresponding creation statements are provided with the program files of Osmosis.

4.4.2 Preprocessing

This section describes the steps which prepare the data for the calculation of incline. Preprocessing steps are applied to both the GPS traces and the street network.

4.4.2.1 GPS data

As described in section 4.3.1.4, the GPS data contains noise and other errors which may degrade the quality of the incline values. Consequently, the traces are preprocessed to lower the impact of such irregularities. The preprocessing is implemented in Java and consists of two steps. First, traces are split where the distance between two adjacent points or the change in elevation exceeds a certain threshold. Second, the split traces are smoothed in order to reduce the noise in the elevation measurements. Figure 19 shows the workflow of the entire preprocessing.

[51] http://download.geofabrik.de/europe/germany.html, checked on 20/07/2015
[52] http://wiki.openstreetmap.org/wiki/Osmosis, checked on 20/07/2015


Figure 19: Flowchart of preprocessing the GPS traces

First, the ordered list of all track points which compose the trace is traversed to detect the points at which the trace needs to be split. The distance d and the change in height Δh are calculated between each point and the adjacent one. If d or Δh exceeds the given threshold, the trace is split into two parts at the detected break point. The first part, from the start of the trace to the break point, is stored in a list. The second part may still contain errors; therefore, the splitting process starts again with the second part. This continues until the trace no longer contains any distances or changes in elevation greater than the defined thresholds. Since one trace, previously identified uniquely by ‘gpx_id’ and ‘trk_id’, is split into several parts, a new ID, the ‘part_id’, is introduced. The threshold for the maximum distance is set to 300 m, as in Zhang et al. (2010), and for the maximum Δh a value of 10 m seems reasonable. As found in section 5.1.1, the majority of Δh values in a flat area are below this, so a change of more than 10 m can be considered an error. These values are highly experimental, and changing them may lead to further improvements.
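The splitting step can be sketched in Java as follows. The sketch is an illustration under assumptions, not the code of the thesis tool: it replaces the repeated splitting described above with a single pass that produces the same parts, it represents a track point as an array `{lat, lon, ele}`, and it computes the distance with the haversine formula on a sphere with an assumed mean radius of 6,371,000 m. The index of a part in the returned list would correspond to the ‘part_id’.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of splitting a trace at large gaps (names and data layout are assumptions). */
final class TraceSplitterSketch {

    static final double EARTH_RADIUS_M = 6371000.0; // assumed mean earth radius

    /** Haversine distance in meters between two points given as {lat, lon, ele}. */
    static double distance(double[] a, double[] b) {
        double lat1 = Math.toRadians(a[0]);
        double lat2 = Math.toRadians(b[0]);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(b[1] - a[1]);
        double s = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(lat1) * Math.cos(lat2) * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(s));
    }

    /** Splits the trace wherever d or |delta h| between adjacent points exceeds its threshold. */
    static List<List<double[]>> split(List<double[]> trace, double maxDist, double maxDeltaH) {
        List<List<double[]>> parts = new ArrayList<>();
        List<double[]> current = new ArrayList<>();
        for (double[] p : trace) {
            if (!current.isEmpty()) {
                double[] prev = current.get(current.size() - 1);
                boolean tooFar = distance(prev, p) > maxDist;
                boolean tooSteep = Math.abs(p[2] - prev[2]) > maxDeltaH;
                if (tooFar || tooSteep) {
                    parts.add(current);          // close the part before the break point
                    current = new ArrayList<>(); // the break point starts a new part
                }
            }
            current.add(p);
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }
}
```

With the thresholds named above, the call would be `TraceSplitterSketch.split(trace, 300.0, 10.0)`.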

Once a trace has been split into parts, each part is smoothed. Methods for smoothing time series were reviewed in section 3.5. A linear smoother is preferred over a non-linear one, since it also smooths abrupt changes in elevation, which usually cannot be expected on streets. Consequently, a weighted moving average algorithm has been implemented. The weights must be defined in advance, and the number of weights must be odd. Note that the sum of all weights must also be equal to one. Subject to these conditions, the weights can be defined individually by the user. They do not depend on the distance between the data points, since a GPS trace is usually recorded using a certain distance and time interval. Thus, the time series is considered equidistant. The following set of weights W has been used:

𝑊 = {0.1, 0.125, 0.15, 0.25, 0.15, 0.125, 0.1}

Since the number of weights is seven, each data point is smoothed using the three preceding and the three following data points. The further away a point is, the smaller is its impact on the new smoothed data point. The smoothed value of a point is calculated as follows:

$$h_n^* = (h_{n-3} \cdot w_1) + (h_{n-2} \cdot w_2) + (h_{n-1} \cdot w_3) + (h_n \cdot w_4) + (h_{n+1} \cdot w_5) + (h_{n+2} \cdot w_6) + (h_{n+3} \cdot w_7),$$

where

$h_n^*$ = smoothed elevation
$h_{n-i}$ = original elevation of data points
$w_i$ = weights

The drawback of this approach is that it cannot be applied directly to the three outermost values at each end of the series, since they lack adjacent points on one side. In order not to dismiss them, they are smoothed using only those adjacent values which exist. For example, the first data point is smoothed using the first four values and the second data point using the first five values in the series. The same applies to the values at the end of the series. After smoothing, the trace parts are written as 3D LineStrings into a new table in the database (Figure 20). The primary key consists of the three ID columns ‘gpx_id’, ‘trk_id’ and ‘part_id’.
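The smoothing described above can be sketched in Java as follows. This is a minimal illustration rather than the thesis implementation. In particular, the text does not state how the weights are treated at the ends of the series; the sketch assumes that the weights of the existing neighbours are renormalised so that they again sum to one. The class and method names are hypothetical.

```java
/** Illustrative sketch of the weighted moving average (edge handling is an assumption). */
final class ElevationSmootherSketch {

    /** The set of weights W used in the thesis. */
    static final double[] W = {0.1, 0.125, 0.15, 0.25, 0.15, 0.125, 0.1};

    /** Returns the smoothed elevations; the input array is not modified. */
    static double[] smooth(double[] h, double[] w) {
        if (w.length % 2 == 0) {
            throw new IllegalArgumentException("number of weights must be odd");
        }
        int half = w.length / 2; // three neighbours on each side for seven weights
        double[] out = new double[h.length];
        for (int n = 0; n < h.length; n++) {
            double sum = 0.0;
            double weightSum = 0.0;
            for (int j = -half; j <= half; j++) {
                int idx = n + j;
                if (idx < 0 || idx >= h.length) {
                    continue; // neighbour does not exist at the ends of the series
                }
                sum += h[idx] * w[j + half];
                weightSum += w[j + half];
            }
            // ASSUMPTION: renormalise by the weights actually used, so that values
            // near the ends are not pulled towards zero.
            out[n] = sum / weightSum;
        }
        return out;
    }
}
```

In the interior of the series the weights sum to one, so the division has no effect and the result equals the formula given above.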

Figure 20: Columns of the relation, which stores the preprocessed GPS traces.


4.4.2.2 Street Network

The incline is calculated for each street segment of the OpenStreetMap street network within the pilot region. The length of a street segment is often determined by semantic properties rather than by geometry. For example, a street segment may be as long as the part of a street which carries one name. Such segments can be quite long and span several valleys and peaks. This is a disadvantage for the calculation of incline, since one average incline value is calculated per segment. If, for example, a segment contains an ascent and a descent of the same magnitude, the calculated incline would be 0. To overcome this issue, the street segments are split into smaller parts. The parts should not be too short either, since this decreases the number of GPS points which can be used for the calculation. Therefore, the streets are split at their intersection points with other street segments. This was done in QGIS, using the function ‘Split lines with lines’ of the processing toolbox. Due to the splitting, the osm-id can no longer be used as a unique identifier; therefore, a new ID is introduced. The resulting relation in the database is shown in Figure 21.

Figure 21: Schema of relation 'streets'.

After this first step, the split street segments are enriched with land use information. For this purpose, the QGIS function ‘Join attributes by location’ was used. In OpenStreetMap, it may happen that a land use polygon does not cover the street. In order to also assign the land use which lies next to the street, a buffer of 20 meters was applied to the polygons in advance. Figure 22 shows the street network and land use polygons which do not cover the street. The dashed line shows the buffered land use polygon. The land use information is stored as a new tag with the key ‘landuse_incline’.


Figure 22: Enhancement of street network with land use information in cases, where land use polygon does not

cover the street segment.

4.4.3 Map Matching

As stated in section 3.4, map matching is an essential step in mining street attribute information from GPS traces, because it must be known to which street the derived information refers. The map matching approach for this research is implemented in Java and based on the algorithm of Zhang et al. (2010). Figure 23 shows the workflow of the implemented algorithm.

Figure 23: Flowchart of map matching process.


First, the street segments are requested from the database. They are processed sequentially, and for each segment the corresponding GPS traces are determined. To find candidate traces in the relation ‘gpx_data_line’, the geometry of the street segment is buffered with a buffer size of 50 m. All GPS traces which intersect the buffer of the street geometry are selected as candidate traces. It is therefore important that the buffer size is not too small, since all possible traces should be selected. Figure 24a depicts the street segment and its buffer in dark and light green, and the GPS traces which intersect the buffer in orange.

Next, GPS traces which were not recorded on the selected street segment must be dismissed. This is often done by analyzing the distance and the direction between the street segment and the GPS trace. Here, neither direction nor distance is calculated explicitly. Instead, for each node of the street segment a line is created which passes through the node and is perpendicular to the current edge of the street segment. For this step, the Java class GeodeticCalculator of the GeoTools library was used. It allows new coordinates to be calculated from a given reference point, a distance and a direction. The length is set to 30 m, as in Zhang et al. (2010), considering the horizontal error of GPS measurements and the width of a street. Figure 24b shows the calculated profile lines in dark blue.

Once the profile lines are created, it is checked for each candidate trace how many of the profile lines it intersects. If a trace intersects most of the profile lines, it follows the direction of the street segment and is generally not further away from the centerline of the street than 30 m. The minimum percentage (threshold) of intersected profile lines is set to 70 % for this research. Lowering the threshold to 50 % would yield more matches but also more false positives (incorrectly matched traces), especially for street segments with only two nodes. In this case, traces which intersect only one profile line would also be matched, and it can then no longer be assumed that the trace follows the direction of the street. On the other hand, a threshold of 100 % would be too restrictive, especially for street segments with many nodes: it can always happen that the trace exceeds the distance of 30 m because of GPS inaccuracy or the geometric error of the street segment. Therefore, the threshold has been set to 70 %. Figure 24c shows in red the GPS traces which intersect at least 3 out of 4 profile lines.
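The profile-line test can be illustrated with the following Java sketch. It deliberately deviates from the thesis implementation in several labelled respects: it works in a planar, projected coordinate system in meters instead of using the GeoTools GeodeticCalculator on WGS 84 coordinates, it interprets the 30 m as the reach of the profile line on each side of the node, it uses the incoming edge for the last node, and it treats a ratio equal to the threshold as a match. All names are hypothetical.

```java
import java.awt.geom.Line2D;

/** Illustrative planar sketch of the profile-line matching (all names are assumptions). */
final class ProfileMatchSketch {

    // Polylines are arrays of {x, y} in a projected CRS (meters); a line is {x1, y1, x2, y2}.

    /** Profile line through node, perpendicular to the edge from -> to, reaching 'reach' m to each side. */
    static double[] profileLine(double[] node, double[] from, double[] to, double reach) {
        double dx = to[0] - from[0];
        double dy = to[1] - from[1];
        double norm = Math.hypot(dx, dy);
        double px = -dy / norm * reach; // unit normal of the edge, scaled to the reach
        double py = dx / norm * reach;
        return new double[] {node[0] - px, node[1] - py, node[0] + px, node[1] + py};
    }

    /** Returns true if any edge of the trace intersects the profile line. */
    static boolean intersects(double[][] trace, double[] profile) {
        for (int i = 0; i + 1 < trace.length; i++) {
            if (Line2D.linesIntersect(trace[i][0], trace[i][1], trace[i + 1][0], trace[i + 1][1],
                                      profile[0], profile[1], profile[2], profile[3])) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the trace intersects at least minRatio of the profile lines of the street. */
    static boolean matches(double[][] street, double[][] trace, double reach, double minRatio) {
        if (street.length < 2) {
            throw new IllegalArgumentException("street segment needs at least two nodes");
        }
        int hits = 0;
        for (int i = 0; i < street.length; i++) {
            boolean last = (i == street.length - 1);
            double[] from = street[last ? i - 1 : i]; // last node: use the incoming edge
            double[] to = street[last ? i : i + 1];
            if (intersects(trace, profileLine(street[i], from, to, reach))) {
                hits++;
            }
        }
        return (double) hits / street.length >= minRatio;
    }
}
```

For a street segment with four nodes, a trace crossing three profile lines reaches a ratio of 0.75 and is matched at a threshold of 0.7, while a trace crossing only two is dismissed.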


Figure 24: The map matching process: select candidate traces with the buffer (light green) of the street (dark green) (a), create profile lines (blue) (b), select traces (red) which intersect at least 70 % of the profile lines (c).

If the percentage of intersected profile lines is greater than the threshold, the GPS trace is considered to have been recorded on the street segment. The approach by Zhang et al. (2010) includes further processing in the case of parallel streets. If two street segments are parallel to each other, it is first checked whether one of them is a one-way street (e.g. a highway where each direction is mapped individually). Then the direction of the traces is calculated, and a trace is only matched with the street segment if the direction of the street is similar. For parallel streets which are not one-way streets, a clustering algorithm is used to identify the corresponding traces. This is not necessary for this research, since for the calculation of the incline it does not matter on which of the parallel streets the trace was recorded; it is assumed that parallel streets have the same incline. Consequently, there is an ‘n to m’ relationship between street segments and GPS traces: one street segment may have multiple GPS traces, and one GPS trace may be matched to multiple street segments. With this assumption, more traces can be matched to a street segment if there are two parallel geometries. Examples are streets represented by one geometry per direction, or footways and bicycle lanes accompanying a street which are mapped as individual LineStrings. However, this assumption may also lead to errors when two parallel streets do not have the same incline, such as an access road to a bridge, as shown in Figure 25.

Figure 25: Example of two parallel streets which do not have the same incline.


After the traces of a street segment have been found, the IDs referencing the street segment and the GPS trace are written to the database. The relation ‘streets_gpx’ references the relations ‘streets’ and ‘gpx_data_line’ via foreign keys. This is shown in the UML diagram in Figure 26.

Figure 26: The tables 'gpx_data_line', 'streets_gpx' and 'streets' and their relation to each other.

The tool is published and thus available to anyone who may need it. The program is kept as generic as possible so that it also works with other database schemas. The required database relations and columns, as well as the input parameters mentioned above, are specified in a properties file (Figure 27).


Figure 27: Properties file of map matching tool

#Properties file for map matching

####### street network #######
# name of street table in database
t_streetName=streets
# column with unique id for each street segment
t_streetIdCol=id
# column with osm_id (must not be unique, in case street segment were split in preprocessing)
t_streetOsmIdCol=osm_id
# column with osm tags, stored in hstore
t_streetTags=tags
# column with geometry. geometry type must be LineString and CRS must be WGS 84
t_streetGeomCol=the_geom

####### gpx input data #######
# name of gpx table in database
t_gpxrawName=gpx_data_line
# unique id for each GPS trace
t_gpxrawIdCol=gpx_id
# unique id for each GPS trace
t_trkrawIdCol=trk_id
# column with geometry. Geometry type must be MultiLineString and CRS must be WGS 84
t_gpxrawGeomCol=geom

# default name for output table
dbMatchingOutputTable=streets_gpx
#buffer in meters (should be equal or bigger than streetProfileLength)
streetBuffer=60
# length of profile lines which are fitted through the nodes of the street segments [m]
streetProfileLength=30
#ratio of profile line of street which need to be intersected by GPS-trace in order to assume a match
streetProfileIntersectionRatio=0.7


4.4.4 Calculation of Incline

After the preceding steps, the incline of the street segments can be calculated. As in the map matching process, the street segments are processed sequentially. The workflow is depicted in Figure 28.

Figure 28: Workflow for calculating the incline of street segments.

The first step is to request the street segments from the database. After that, the preprocessed GPS traces (cf. section 4.4.2.1) must be requested from the database. They can easily be requested through the relation ‘streets_gpx’ and a join with the relation ‘gpx_data_line_preprocessed’. Usually, a GPS trace spans more than one street segment. Thus, the traces need to be cut so that only those track points are selected which are relevant for the calculation of the incline. Relevant are those track points which are near the street segment and not before or beyond it. To achieve this, a buffer with a size of approximately 30 m is calculated around the street segment and used to clip the corresponding GPS traces. Consequently, only the part of each GPS trace which lies within the buffer polygon of the street segment is used.


Figure 29: Clipping of assigned GPS traces.

The clipping is performed in the database, since the Java library GeoTools does not support clipping of 3D geometries. For the selection of the GPS traces and the clipping, the following SQL statement is executed from the Java tool:

SELECT g.gpx_id,
       g.trk_id,
       g.part_id,
       ST_ASGEOJSON(
           ST_INTERSECTION(g.geom, ST_GeomFromEWKB('buffered_street_geom'))
       ) AS geom
FROM streets_gpx sg
LEFT JOIN gpx_data_line_preprocessed g ON
       sg.gpx_id = g.gpx_id AND
       sg.trk_id = g.trk_id
WHERE
       sg.street_id = 'current_street_id'

Besides the IDs which uniquely identify the GPS trace part, the statement returns the clipped geometry. It has to be noted that the PostGIS function ST_INTERSECTION can return the clipped geometries of the GPS trace parts as MultiLineStrings. This has to be considered later, when the incline is calculated.


Once all corresponding traces with clipped geometry have been requested from the database, the incline in percent is calculated for each trace. Since a clipped GPS trace is a MultiLineString, it needs to be split into LineString objects. The incline, here denoted by m, is the weighted average of the inclines of all edges of the LineString. The weight is the horizontal length of an edge, normalized by the full horizontal length of the LineString. Since the elevation is given in meters and the GPS coordinates are geographic coordinates, the distance in meters is calculated on the assumption that the earth is a sphere. The calculation of the incline can be expressed using the following formula:

$$m_t = \sum_{i=1}^{n-1} \left( \frac{h_i - h_{i+1}}{d_{i,i+1}} \right) \left( \frac{d_{i,i+1}}{l} \right),$$

where

$m_t$ = incline of GPS trace segment in percent
$n$ = number of track points
$h$ = elevation of track point
$d$ = horizontal distance between two track points
$l$ = horizontal length of GPS trace segment
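The formula can be sketched in Java as follows. The sketch is an illustration under assumptions: track points are arrays `{lat, lon, ele}`, the horizontal distance is computed with the haversine formula on a sphere with an assumed mean radius of 6,371,000 m, and the factor 100 for the conversion to percent is added explicitly. The sign follows the formula as printed (h_i minus h_{i+1}), so that an ascending trace yields a negative value; whether the thesis tool uses this sign convention is not known from the text.

```java
/** Illustrative sketch of the per-trace incline (names, radius and percent factor are assumptions). */
final class TraceInclineSketch {

    static final double EARTH_RADIUS_M = 6371000.0; // assumed mean earth radius

    /** Haversine distance in meters between two points given as {lat, lon, ele}. */
    static double haversine(double[] a, double[] b) {
        double lat1 = Math.toRadians(a[0]);
        double lat2 = Math.toRadians(b[0]);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(b[1] - a[1]);
        double s = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(lat1) * Math.cos(lat2) * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(s));
    }

    /** Length-weighted average of the edge inclines of one LineString, in percent. */
    static double inclinePercent(double[][] pts) {
        if (pts.length < 2) {
            throw new IllegalArgumentException("at least two track points are required");
        }
        double[] d = new double[pts.length - 1];
        double l = 0.0; // full horizontal length of the LineString
        for (int i = 0; i < d.length; i++) {
            d[i] = haversine(pts[i], pts[i + 1]);
            l += d[i];
        }
        double m = 0.0;
        for (int i = 0; i < d.length; i++) {
            if (d[i] == 0.0) {
                continue; // identical positions: skip to avoid a division by zero
            }
            double edgeIncline = (pts[i][2] - pts[i + 1][2]) / d[i]; // sign as in the printed formula
            m += edgeIncline * (d[i] / l);                          // weight: edge length / total length
        }
        return 100.0 * m;
    }
}
```

Because each edge incline is weighted by its own length, the sum reduces to the elevation difference between the first and the last point divided by the total horizontal length.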

In OSM, street segments are directed from the first node of the segment to the last one. Consequently, it has to be checked whether the GPS trace was recorded in the same direction as the street or in the opposite one. This is done by comparing the average bearing of the street segment with that of the GPS trace. If the difference in bearing is greater than a certain threshold, it is assumed that the GPS trace follows the opposite direction, and the sign of the calculated incline value is inverted. Individual samples have shown that, given the geometric accuracy of both the GPS trace and the street segment, a threshold of 40° is reasonable.
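The direction check can be sketched in Java as follows. The text does not specify how the average bearing is computed; the sketch therefore only provides a helper for the initial great-circle bearing between two points (which could, as an assumption, be applied to the first and last point of a geometry) and the comparison itself. All names are hypothetical.

```java
/** Illustrative sketch of the direction check (how the average bearing is obtained is an assumption). */
final class DirectionCheckSketch {

    /** Initial great-circle bearing in degrees [0, 360) from a to b; points are {lat, lon}. */
    static double bearing(double[] a, double[] b) {
        double lat1 = Math.toRadians(a[0]);
        double lat2 = Math.toRadians(b[0]);
        double dLon = Math.toRadians(b[1] - a[1]);
        double y = Math.sin(dLon) * Math.cos(lat2);
        double x = Math.cos(lat1) * Math.sin(lat2)
                 - Math.sin(lat1) * Math.cos(lat2) * Math.cos(dLon);
        return (Math.toDegrees(Math.atan2(y, x)) + 360.0) % 360.0;
    }

    /** Smallest angle between two bearings, in degrees [0, 180]. */
    static double bearingDifference(double b1, double b2) {
        double diff = Math.abs(b1 - b2) % 360.0;
        return diff > 180.0 ? 360.0 - diff : diff;
    }

    /** Inverts the incline if the trace bearing deviates from the street bearing by more than the threshold. */
    static double alignIncline(double incline, double streetBearing, double traceBearing,
                               double thresholdDeg) {
        return bearingDifference(streetBearing, traceBearing) > thresholdDeg ? -incline : incline;
    }
}
```

Normalising the difference to the range from 0° to 180° matters near north: bearings of 350° and 10° differ by 20°, not by 340°, so such a trace is still treated as following the street direction at a threshold of 40°.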


The next step is to combine the incline values of the individual traces into one value which represents the incline of the street segment. Since the length of the traces may vary, a weighted average based on the length of the traces was chosen here as well. The following formula has been used for averaging the incline values of the individual traces:

$$m_s = \sum_{a=0}^{k} \left( m_t^{a} \right) \left( \frac{l_a}{\sum_{a=0}^{k} l_a} \right),$$

where

$m_s$ = incline of street segment in percent
$m_t$ = incline of GPS trace in percent
$k$ = number of corresponding traces
$l$ = horizontal length of trace
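The averaging over the traces of one street segment can be sketched in Java as follows; the class and method names are hypothetical and the sketch is meant as an illustration only.

```java
/** Illustrative sketch of combining per-trace inclines into one street incline (names are assumptions). */
final class StreetInclineSketch {

    /** Length-weighted average of the trace inclines; both arrays must have the same length. */
    static double combine(double[] inclines, double[] lengths) {
        if (inclines.length == 0 || inclines.length != lengths.length) {
            throw new IllegalArgumentException("need one length per incline and at least one trace");
        }
        double total = 0.0;
        for (double len : lengths) {
            total += len;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total trace length must be positive");
        }
        double ms = 0.0;
        for (int a = 0; a < inclines.length; a++) {
            ms += inclines[a] * (lengths[a] / total); // weight: share of the total trace length
        }
        return ms;
    }
}
```

For example, two traces with inclines of 2 % and 4 % and lengths of 100 m and 300 m give 2 · 0.25 + 4 · 0.75 = 3.5 %.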

After the calculation, the result is written into a new relation called ‘streets_incline’. Besides the street-id and the calculated incline, the number of traces used for the calculation is added to the table. This supports the estimation of the accuracy, which is described in section 0.

4.5 Validation

In chapter 5, the results will be validated to estimate the achieved accuracy. This is done by a comparison with reference incline values. Such reference values could be calculated from terrestrial measurements, but acquiring test data for a reasonable number of streets in this way would be very costly and time-consuming. Therefore, it was decided to use incline values calculated from a high-accuracy DTM acquired from LiDAR measurements (cf. section 4.3.4). To avoid confusion, these incline values are referred to as DTM incline in the following, whereas the incline calculated from GPS traces is referred to as GPS incline. With the DTM, it is possible to calculate an incline value with reasonable accuracy for all street segments. Moreover, the results of the thesis are compared to incline values calculated from the SRTM-1 DSM (SRTM incline). This shows how crowdsourced GPS traces perform in comparison to other globally and freely available data.

The first step in calculating the incline from a DEM which is available in raster format is to densify the street geometry in order to increase the number of nodes per street segment. The street segments receive additional nodes so that the spacing between nodes is at most three meters. For each node of the densified street geometry, the absolute elevation is taken from the DEM. The Java library GeoTools provides classes and functions to load georeferenced raster files and to return the elevation for a specific location. Once each node has an elevation value, the incline of the street segment can be calculated. This is done by averaging the incline values of all edges of the street segment. In contrast to the calculation of the GPS incline, a non-weighted average is calculated, since the nodes of the densified street segment are equidistant. The DTM incline and the SRTM incline are then added to the relation ‘streets_incline’, which was already created during the calculation of the GPS incline.
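The densification and the non-weighted averaging can be sketched in Java as follows. This is an illustration under assumptions, not the thesis code: it works with planar coordinates in meters, it replaces the GeoTools raster lookup by a hypothetical elevation function passed in as a `DoubleBinaryOperator`, and it uses the same sign convention as the GPS incline formula (elevation of the current node minus elevation of the next node).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleBinaryOperator;

/** Illustrative sketch of the DEM-based incline (planar coordinates and DEM lookup are assumptions). */
final class DemInclineSketch {

    /** Adds nodes so that no edge is longer than maxSpacing; points are {x, y} in meters. */
    static List<double[]> densify(double[][] line, double maxSpacing) {
        List<double[]> out = new ArrayList<>();
        out.add(line[0]);
        for (int i = 0; i + 1 < line.length; i++) {
            double dx = line[i + 1][0] - line[i][0];
            double dy = line[i + 1][1] - line[i][1];
            int pieces = Math.max(1, (int) Math.ceil(Math.hypot(dx, dy) / maxSpacing));
            for (int k = 1; k <= pieces; k++) {
                out.add(new double[] {line[i][0] + dx * k / pieces, line[i][1] + dy * k / pieces});
            }
        }
        return out;
    }

    /** Non-weighted mean of the edge inclines in percent; elevationAt stands in for the raster lookup. */
    static double meanInclinePercent(List<double[]> pts, DoubleBinaryOperator elevationAt) {
        double sum = 0.0;
        int edges = 0;
        for (int i = 0; i + 1 < pts.size(); i++) {
            double[] a = pts.get(i);
            double[] b = pts.get(i + 1);
            double d = Math.hypot(b[0] - a[0], b[1] - a[1]);
            if (d == 0.0) {
                continue; // duplicate node: no incline defined
            }
            double h0 = elevationAt.applyAsDouble(a[0], a[1]);
            double h1 = elevationAt.applyAsDouble(b[0], b[1]);
            sum += (h0 - h1) / d; // same sign convention as the GPS incline formula
            edges++;
        }
        return edges == 0 ? 0.0 : 100.0 * sum / edges;
    }
}
```

Note that this densification produces equidistant nodes only within each original edge; the non-weighted average is therefore exact for segments whose original edges yield the same spacing and an approximation otherwise.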

For the calculation of the SRTM incline, the elevation of each node of the street segment is looked up in the SRTM-1 DSM. The horizontal resolution of SRTM-1 is approximately 30 m. With a street node every 3 m, about 10 consecutive nodes would receive the same elevation value from the DSM, which would degrade the quality of the SRTM incline. To avoid this, the SRTM-1 DSM was resampled to a horizontal resolution of 1 m by 1 m using the ArcGIS function ‘Resample’. This results in interpolated elevation values for each 1 m by 1 m pixel.


5 Discussion of Results

The implemented methods from the previous chapter were applied to the GPS and street network

data of the pilot region. The outcome will be discussed in the following sections. First of all, the

accuracy of the GPS raw data will be assessed. After that, the calculated GPS incline values will be

validated using a high-accuracy DTM and compared to the incline values derived from the

SRTM-1 DEM.

5.1 Analysis of Crowdsourced GPS traces

As already stated in section 0, a total of 3842 GPS traces with over two million track points with elevation information are used for this research. In this section, these 3D points, visualized in Figure 30, are analyzed with regard to vertical accuracy as well as coverage and density.

Figure 30: Screenshot of visualized GPS track points, colorized according to their elevation (green = low, red = high).⁵³

5.1.1 Vertical Absolute and Relative Accuracy

In order to judge the accuracy of the data source for the calculation of incline values, the measured GPS elevation is compared to the elevation from the LiDAR DTM. First, the absolute error of the GPS track points is calculated; second, the relative accuracy within the GPS traces is evaluated. The relative accuracy refers to the accuracy of the elevation difference between two adjacent points. It has to be noted that the elevation coming from the LiDAR DTM may also suffer from inaccuracy. The DTM reflects the terrain of the earth; consequently, structures on the earth’s surface, such as buildings, trees or bridges, are not contained. The areas where no ground elevation could be measured are interpolated. This leads to an error when transferring the DTM elevation to a GPS track point which is located on a bridge. Furthermore, the horizontal inaccuracy of the GPS points influences the elevation values taken from the DTM. If a GPS trace is recorded on a street directly next to a very steep slope, the GPS track point may fall up to 10 m beside the street, depending on the horizontal error. As a consequence, a wrong elevation value is transferred from the DTM.

⁵³ Screenshot taken from: http://cap4route.geog.uni-heidelberg.de/hd-osm-gps-webgl/hd-osm-gps-webgl.html, checked on 20/07/2015

5.1.1.1 Absolute Accuracy

For the assessment of the absolute accuracy, the root-mean-square error (RMSE) has been calculated overall and for each land use class. The RMSE (the square root of the average of the squared residuals) is commonly used in spatial analysis and provides a measure of the differences between GPS and DTM elevations. The residual is the difference between the GPS measurement and the reference value (DTM). The diagram in Figure 31 shows the RMSE in meters, overall and differentiated by land use classes. For the calculation of the RMSE, only 90 % of the residuals have been used, while the other 10 % have been excluded as outliers. The outliers may come from traces which were not recorded on the earth’s surface (e.g. from an airplane) or from wrongly calibrated devices with a barometric elevation measurement unit.
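The RMSE with outlier exclusion can be illustrated with a short sketch. The text does not state how the excluded 10 % were selected; the example below assumes that the residuals with the largest absolute value are dropped. The function name and the sample values are hypothetical.

```python
import math


def trimmed_rmse(residuals, keep_fraction=0.9):
    """RMSE over the keep_fraction of residuals with the smallest magnitude.

    Assumption: outliers are the residuals with the largest absolute
    value. The thesis does not specify the selection rule.
    """
    if not residuals:
        raise ValueError("no residuals given")
    ordered = sorted(residuals, key=abs)
    n_keep = max(1, round(len(ordered) * keep_fraction))
    kept = ordered[:n_keep]
    return math.sqrt(sum(r * r for r in kept) / len(kept))


# Hypothetical residuals (GPS elevation minus DTM elevation) in meters;
# the last value mimics a trace recorded from an airplane.
residuals = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0, 2.0, -2.0, 1.0, 500.0]
print(round(trimmed_rmse(residuals), 2))        # outlier excluded
print(round(trimmed_rmse(residuals, 1.0), 2))   # outlier included
```

Comparing both calls shows how strongly a single extreme residual can dominate the RMSE, which motivates the exclusion.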

Figure 31: Vertical accuracy of crowdsourced GPS traces, distinguished by land use class.

[Figure 31 data, RMSE [m] by land use: allotments 35, commercial 34, grass 30, overall 27, residential 25, forest 22, industrial 22, farmland 21]

Depending on the land use class, the error ranges from 21 m (farmland) to 35 m (allotments). The overall RMSE lies approximately in the middle, at 27 m. According to Liu et al. (2014), the vertical error can be up to 2.5 times larger than the horizontal one. Considering a horizontal accuracy of 6 to 10 m (cf. Zhang et al. 2010), a vertical accuracy of 15 m to 25 m can be assumed. The RMSE of the data evaluated in this study is within this range only for some land use classes; overall, the error is higher. Several reasons may lead to these errors. User-generated traces are likely not to be recorded under very good conditions. One may store the GPS device in a car, in a pants pocket or in a backpack while hiking. Under these conditions, an additional error may occur. Furthermore, the GPS data may contain traces which have not been recorded on the earth’s surface. Traces have been found which were obviously recorded from a flying object. An additional reason may be the diversity of devices. Some elevations may originate from either barometric measurements or elevation databases, as stated in section 4.3.1. While barometers may be wrongly calibrated and consequently introduce a systematic error, elevations from elevation databases may be referenced to a geoid instead of the WGS 84 ellipsoid, i.e. the ellipsoidal height from the GPS is transformed to a geoidal height. This would introduce a systematic error as well and affect the measure of the absolute vertical accuracy.

The differentiation by land use has been undertaken to evaluate whether the RMSE depends on it. Different land use classes have different characteristics with regard to the obstruction of the GPS signal. Especially in forested or residential areas a larger error would be expected, since a dense tree canopy or high buildings obstruct the view from the GPS receiver to the satellites. In addition, multipath effects may occur when the GPS signal is reflected by the façades or windows of buildings. In contrast, areas of land use classes such as ‘farmland’, ‘allotments’ or ‘grass’ were expected to have a smaller absolute error. In those areas one generally finds small houses, just a few trees and wide streets. Consequently, there are not many structures which potentially influence the GPS signal, so a wide and unobstructed view to the sky and the satellites is possible. As can be seen in the diagram in Figure 31, this assumption cannot be completely confirmed in the case of OSM GPS traces. The land use classes ‘allotments’ and ‘grass’, which were expected to be less erroneous, show some of the largest errors, whereas GPS track points in the class ‘forest’ have one of the smallest errors. This is very surprising, but a reason may be the sample size of the land use classes. ‘Forest’ is one of the land use classes with the most GPS track points, whereas ‘allotments’ and ‘grass’ have fewer.

Figure 32 depicts a histogram of the differences between GPS and DTM elevations that have been used to calculate the aforementioned RMSE. The vertical axis shows the total number of points which fall into the bins represented on the horizontal axis. The width of the bins is two meters. It can be seen that the data is mainly normally distributed around zero meters. Since the elevation of the DTM is referenced to the quasigeoid (cf. section 4.3.4) and the GPS elevation is usually referenced to the WGS 84 ellipsoid, an offset equivalent to the geoid undulation was expected for most of the points. The geoid undulation between the WGS 84 ellipsoid and the Earth Gravitational Model 96 (EGM96) in Heidelberg is approximately 48 m.⁵⁴ Because the main peak lies at zero rather than at 48 m, it can be assumed that most of the GPS elevations are referenced to a geoid. Most likely, the elevation measurements are internally transformed by the software of the device. In the histogram a second peak at around 48 m can be found. This represents those GPS elevations which are referenced to the WGS 84 ellipsoid.

⁵⁴ Calculation done with the online geoid calculator at http://geographiclib.sourceforge.net/cgi-bin/GeoidEval, checked on 20/07/2015
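The relation between the two height references can be sketched as follows. The conversion H = h - N (geoid height H, ellipsoidal height h, undulation N) is the standard relation; the classification heuristic and all names in the example are hypothetical and were not part of the processing in this thesis.

```python
GEOID_UNDULATION_M = 48.0  # approximate value for Heidelberg quoted above


def geoid_height(ellipsoidal_height_m, undulation_m=GEOID_UNDULATION_M):
    """Convert an ellipsoidal height h to a geoid height H = h - N."""
    return ellipsoidal_height_m - undulation_m


def guess_vertical_datum(gps_elev_m, dtm_elev_m,
                         undulation_m=GEOID_UNDULATION_M):
    """Hypothetical heuristic: decide which histogram peak a point is nearer to.

    A residual close to zero suggests a geoid-referenced GPS elevation,
    a residual close to the undulation suggests an ellipsoidal elevation.
    """
    residual = gps_elev_m - dtm_elev_m
    if abs(residual - undulation_m) < abs(residual):
        return "ellipsoid"
    return "geoid"


print(geoid_height(160.0))
print(guess_vertical_datum(160.0, 112.0))
print(guess_vertical_datum(113.0, 112.0))
```

Such a heuristic is only meaningful for points whose residual is dominated by the datum offset and not by measurement error, which with an RMSE of 27 m is not guaranteed for individual points.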

Figure 32: Histogram with the differences of GPS and DTM elevation

5.1.1.2 Relative Accuracy

With approximately 27 m, the absolute error appears to be very large, especially compared to the vertical accuracy of the SRTM-1 DSM, which is 6.2 m (cf. section 4.3.4). However, since only the difference in elevation of two adjacent track points is used for the calculation of incline, the relative accuracy is evaluated in this section. For each GPS track point, the change of elevation $\Delta h_{GPS}$ to the next track point of the trace has been calculated. This value reflects the actual incline of the terrain including an error caused by the GPS measurement. Since the absolute error is not observed here, the occurring errors can be considered as noise. To remove the influence of the terrain, the high-accuracy DTM has been used to calculate the corresponding change in elevation $\Delta h_{DTM}$. The difference between $\Delta h_{GPS}$ and $\Delta h_{DTM}$ indicates the actual GPS error $e_{\Delta h}$. Therefore, the following formula has been used:

$$e_{\Delta h} = \Delta h_{GPS} - \Delta h_{DTM}$$
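As an illustration of this formula, the following sketch computes the error of the elevation change for consecutive track points of one trace. The function name and the sample elevations are hypothetical.

```python
def relative_errors(gps_elevations, dtm_elevations):
    """Error of the elevation change between consecutive track points.

    Both lists hold one elevation per track point of the same trace:
    the GPS elevation and the DTM elevation at the same position.
    """
    if len(gps_elevations) != len(dtm_elevations):
        raise ValueError("both lists must have the same length")
    errors = []
    for i in range(len(gps_elevations) - 1):
        dh_gps = gps_elevations[i + 1] - gps_elevations[i]
        dh_dtm = dtm_elevations[i + 1] - dtm_elevations[i]
        errors.append(dh_gps - dh_dtm)
    return errors


# Hypothetical trace with a constant absolute offset of about 30 m.
gps = [130.0, 131.0, 131.5]
dtm = [100.0, 100.8, 101.5]
print([round(e, 2) for e in relative_errors(gps, dtm)])
```

The example trace carries a large absolute offset, yet the resulting errors are only a few decimeters, since a constant offset cancels out when elevation differences are formed.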

For the assessment of the relative accuracy, the preprocessed and smoothed GPS traces have been used. $e_{\Delta h}$ has been calculated for approximately 1.7 million track points. The box-and-whisker plot in Figure 33 shows the distribution of $e_{\Delta h}$ overall and differentiated by land use class. The bottom and top of the boxes represent the first and third quartile, whereas the ends of the whiskers show the 5th and the 95th percentile. Percentiles were used instead of the minimum and maximum because of a few but very large outliers present in the data set.
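The statistics behind such a box-and-whisker plot can be sketched with the Python standard library. The function name is hypothetical, and the default interpolation method of `statistics.quantiles` is an assumption; the plotting software used for Figure 33 may interpolate differently.

```python
import statistics


def box_whisker_stats(values):
    """Quartiles for the box and 5th/95th percentiles for the whiskers."""
    quartiles = statistics.quantiles(values, n=4)      # 3 cut points
    percentiles = statistics.quantiles(values, n=100)  # 99 cut points
    return {
        "p5": percentiles[4],
        "q1": quartiles[0],
        "median": quartiles[1],
        "q3": quartiles[2],
        "p95": percentiles[94],
    }


# Hypothetical sample; real input would be the e_dh values of one class.
sample = [float(v) for v in range(1, 100)]
print(box_whisker_stats(sample))
```

Using the 5th and 95th percentile keeps the whiskers readable, because single extreme values no longer determine their length.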

[Figure 32 axes: horizontal axis “Differences GPS and DTM Elevation [m]” (-30 to 78); vertical axis “Frequency [thousands]” (0 to 200)]

Figure 33: Relative accuracy of crowdsourced GPS track points, overall and distinguished by land uses.

Overall, the error in the change of elevation is within a range of ± 0.16 m for 50 % of the data. The whiskers extend to approximately ± 1.00 m. Contrary to the absolute error, it is obvious here that land use classes which do not suffer from obstructions by buildings and trees perform better than others. For areas with mainly grass and farmland, the box spans approximately ± 0.12 m (grass) and ± 0.10 m (farmland). For 90 % of the data, $e_{\Delta h}$ falls into an interval of ± 0.7 m (grass) and ± 0.5 m (farmland). For land use classes which suffer more from obstructions, like ‘commercial’, ‘industrial’, ‘residential’ and ‘allotments’, 50 % of the points are influenced by an error within a range from ± 0.13 m to ± 0.15 m. Although the first and third quartiles of the aforementioned land uses are similar, the extents of the whiskers vary. While the whiskers of the land use class ‘industrial’ are within a range of ± 0.6 m, the range increases to approximately ± 0.7 m for ‘commercial’ as well as for ‘allotments’ and even to over ± 0.8 m for the land use ‘residential’. The land use class with the largest extent of errors in this investigation is ‘forest’. The error for 50 % of the data falls into the range of ± 0.26 m, which is more than twice as high as for farmland. The 5th and 95th percentiles also show a large deviation of $e_{\Delta h}$, with approximately ± 1.5 m. This reflects the obstructions which are present in forest areas due to the dense tree canopy.

[Figure 33 axes: vertical axis $e_{\Delta h}$ [m] (-2 to 2); horizontal axis “Land use”]

For the calculation of incline, not only the difference in elevation is important; the distance between the two adjacent points matters as well. Therefore, it cannot be judged from the error $e_{\Delta h}$ examined above how much it influences the incline value. To get an idea of this, all values of $e_{\Delta h}$ were aggregated by land use class and the RMSE was calculated with 90 % of the data. Furthermore, the average distance between two adjacent points within the areas of the specific land use classes has been used to calculate the effect of the error on the incline. Table 6 shows that the calculated incline between two adjacent GPS track points contains an overall error of 2.4 %. The result depends on the land use class. The smallest error occurs in areas of the land use classes ‘grass’ and ‘farmland’, with 1 % or less. In forested areas an error of over 4 % can be expected. The track points within the other areas are in a range of 2.2 % to 2.7 %.

Land use       RMSE $e_{\Delta h}$ [m]   Distance [m]   ≙ incline [%]
overall        0.3                       14             2.4
allotments     0.2                       11             2.2
commercial     0.2                       10             2.4
farmland       0.2                       17             1.0
forest         0.5                       11             4.3
grass          0.2                       29             0.8
industrial     0.2                       10             2.4
residential    0.3                       10             2.7

Table 6: The effect of the relative accuracy on the calculated incline.
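The conversion of an elevation error into an incline error can be sketched as follows. The function name is hypothetical. The values in Table 6 were presumably calculated from unrounded RMSE values and distances, so inserting the rounded table values does not reproduce the last column exactly.

```python
def incline_error_percent(rmse_dh_m, mean_distance_m):
    """Incline error in percent caused by an elevation-change error.

    rmse_dh_m: RMSE of the elevation-change error in meters.
    mean_distance_m: average distance between adjacent track points.
    """
    if mean_distance_m <= 0:
        raise ValueError("distance must be positive")
    return rmse_dh_m / mean_distance_m * 100.0


# Rounded values of the class 'forest' from Table 6.
print(round(incline_error_percent(0.5, 11.0), 1))
```

The formula makes clear why ‘grass’ performs well: with a similar elevation error, a larger distance between adjacent track points leads to a smaller incline error.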

To summarize, the relative accuracy depends on the land use class and on its characteristics regarding obstructions by buildings, trees or other structures. The error sources which are mainly responsible for this result are shadowing and multipath (cf. section 2.1.2). Other error sources like ionospheric effects or clock inaccuracies do not affect the results as much. Furthermore, it can be seen from the box-and-whisker plot in Figure 33 that the error occurs more or less equally in positive and negative direction. Therefore, it may be assumed that, due to the large number of points which are used for the calculation of the incline for one street segment, the noise will largely cancel out when calculating the average.

5.1.2 Coverage and density

In this section the coverage and density of the OSM GPS traces are examined. This gives an idea of the number of streets for which data actually exists and of how dense the GPS track points are. The coverage is investigated here using the result of the map matching algorithm implemented for this research. It has to be noted that, due to the n-to-m relationship, GPS traces may be matched to more than one street.

Figure 34 shows a street map of the city of Heidelberg indicating the coverage of streets with GPS traces. The street segments are colorized according to the number of traces, from green (few traces) to red (many traces). Street segments visualized in blue do not have any matched GPS traces. A large share of the street segments is covered with at least one trace; however, many streets have no matching GPS traces. Streets with over 20 corresponding traces are relatively rare. Streets with a high traffic volume have the best coverage. These are the motorways (German Autobahn) in the upper left corner, the streets on both banks of the river Neckar (upper right corner) and the two primary streets in the center of the figure running in north-south direction.

Figure 34: Map, showing the coverage of the streets with GPS traces. (Map: OSM)

The coverage of GPS traces is now investigated in more detail. It shall later be investigated whether the number of traces has an effect on the accuracy. Therefore, it is interesting to know how many corresponding traces can theoretically be used to calculate the incline. Figure 35 shows the share of street length by street type for different minimum numbers of assigned traces. The street types ‘motorway’, ‘primary’, ‘secondary’ and ‘cycleway’ have a coverage with at least one GPS trace of almost 100 %. Other street types such as ‘residential’, ‘path’ and ‘footway’ are covered with at least one trace in 42 % to 61 % of the cases. This shows that the coverage with GPS traces depends on the traffic volume and the street type. For example, the motorway, which is probably the street type with the highest traffic volume, is completely (100 %) covered with at least 25 traces, followed by the street type ‘primary’, which is below ‘motorway’ in the hierarchy of street types. All (100 %) of the primary streets have at least 5 GPS traces; the share decreases to 70 % with at least 20 traces. With 30 traces, still 20 % of the primary streets are covered. Of the streets of the types ‘secondary’, ‘tertiary’ and ‘cycleway’, around 95 % are covered with at least 1 trace and approximately 80 % with at least 5 traces. The street types with the lowest coverage are ‘residential’, ‘footway’ and ‘path’. Still, 42 % to approximately 60 % have at least 1 trace, whereas this decreases significantly to 15 % with at least 5 traces. Fewer than 5 % of the streets of the aforementioned types are covered with 30 or more traces. Generally, it can be said that streets of higher priority have more matched GPS traces. Streets which can be used by bicycles perform as well as secondary and tertiary streets. Paths which are dedicated to pedestrians are comparable to residential streets.
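The share of street length with a minimum number of matched traces can be sketched as follows. The data structure, the function name and the sample segments are hypothetical and only illustrate the length-weighted calculation.

```python
def coverage_share(segments, min_traces):
    """Share of street length covered by at least min_traces GPS traces.

    segments: list of (length_m, number_of_matched_traces) tuples.
    """
    total_length = sum(length for length, _ in segments)
    if total_length == 0:
        return 0.0
    covered_length = sum(
        length for length, n_traces in segments if n_traces >= min_traces
    )
    return covered_length / total_length


# Hypothetical street segments of one street type.
segments = [(100.0, 0), (200.0, 3), (300.0, 12), (400.0, 30)]
for threshold in (1, 5, 10, 15, 20, 25, 30):
    print(threshold, coverage_share(segments, threshold))
```

The share is weighted by length and not by the number of segments, so long segments without traces lower the coverage more than short ones.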

Figure 35: The coverage with GPS traces for different street types.

Besides the coverage, the density of GPS track points within a GPS trace is also examined, using the preprocessed data set. The term density refers to the distance between two adjacent GPS track points. A shorter distance between two track points also means that there are more GPS track points per street segment to calculate the incline. Due to the noise in the GPS data, more points make the result more robust. The interval in which GPS track points are recorded can usually be set in the settings of the device. It can be either time-dependent (e.g. every second) or location-dependent (e.g. every 30 m). Figure 36 shows a diagram of the average distance between two consecutive track points, distinguished by street type. ‘Motorway’ is the street type with the largest average distance between two points, with approximately 35 m, followed by ‘primary’ and ‘secondary’ streets (approximately 17 m). In the middle, the street types ‘cycleway’ and ‘tertiary’ have an average distance of almost 14 m. This value also reflects the overall distance. The street types with the shortest distance are ‘residential’, ‘footway’ and ‘path’. As in the investigation of the coverage, a dependency of the distance on the priority of the street type and on the average speed can be seen. Whereas a motorway is likely to be the street type with the highest speed, a footway or path is the one with the lowest, since they are used only by pedestrians. This leads to the assumption that the majority of the track points are recorded with a time-dependent interval, which results in different distances between two adjacent GPS track points. With an average distance between two adjacent GPS track points of 14 m, about two track points fall into one pixel of the SRTM-1 DSM, whose horizontal resolution is 30 m by 30 m.

[Figure 35 axes: horizontal axis “Minimum number of traces” (1, 5, 10, 15, 20, 25, 30); vertical axis “Share of street length” (0 % to 100 %); series: cycleway, footway, motorway, path, primary, residential, secondary, tertiary]

Figure 36: Average distance of two adjacent GPS track points differentiated by street type.

5.2 Analysis of Calculated Incline

The GPS incline was calculated for a total length of 3064 km of street network in the pilot region. Due to the incomplete coverage of the OSM GPS traces, this corresponds to approximately 57 % of the complete street network, which has a total length of 5338 km. The map in Figure 37 visualizes the calculated GPS incline for the street network within the pilot region. Streets for which no GPS incline was calculated are not shown. It can be seen that the western part of the region is mainly flat, whereas the eastern part is mountainous. In section 5.2.2, the achieved accuracy of the calculated GPS incline is examined using the DTM incline. In addition, the accuracy of the GPS incline is compared to the incline derived from the SRTM-1 DSM in section 5.2.3. This shows how the approach of this research performs in comparison to other open-licensed and globally available data.

Figure 37: Visualization of the GPS incline. Streets with no coverage are not displayed. (Map: OSM)

[Figure 36 data, average distance between two adjacent track points in m by street type: path 8, footway 9, residential 10, tertiary 14, cycleway 14, overall 14, secondary 16, primary 18, motorway 36]

5.2.1 Exclusion of data from the evaluation

The error of incline may be influenced by irregularities of the DTM. Figure 38 shows a motorway junction with the underlying DTM, for which the DTM incline is calculated erroneously, with a value of 30 %. This phenomenon occurs especially at bridges or underpasses, since bridges are partly removed from the DTM. Due to the characteristics of the pilot region, streets with an incline of more than 20 % are likely to exist only rarely. However, for over 20 km of the street segments, the DTM incline is above 20 % and in some cases even over 100 %. These 20 km correspond to less than 1 % of the entire street network, but still influence the result due to the high magnitudes. Therefore, all street segments which have a DTM incline greater than 20 % are excluded from the evaluation to make sure that the result is not distorted by wrong reference data. Furthermore, street segments with a GPS incline and SRTM incline of over 35 % are not considered for this evaluation. The steepest street in the world, Baldwin Street in Dunedin (New Zealand), has a maximum incline of approximately 35 %.⁵⁵ Apart from a few small paths in a mountainous region, streets with inclines over 35 % practically do not exist; such values are consequently considered as wrong calculations and are not used for this evaluation.
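The exclusion rules can be expressed as a simple filter. This sketch makes two assumptions which are not stated in the text: the thresholds are applied to the absolute value of the incline, and a segment is dropped as soon as either the GPS incline or the SRTM incline exceeds 35 %. All names are hypothetical.

```python
DTM_MAX_INCLINE = 20.0       # percent, threshold for the reference data
GPS_SRTM_MAX_INCLINE = 35.0  # percent, threshold for the evaluated data


def keep_for_evaluation(dtm_incline, gps_incline, srtm_incline):
    """Return True if a street segment is used in the evaluation.

    Assumption: thresholds refer to the magnitude of the incline, and
    exceeding the limit in GPS or SRTM incline is enough to exclude.
    """
    if abs(dtm_incline) > DTM_MAX_INCLINE:
        return False
    if abs(gps_incline) > GPS_SRTM_MAX_INCLINE:
        return False
    if abs(srtm_incline) > GPS_SRTM_MAX_INCLINE:
        return False
    return True


# Hypothetical segments: (DTM incline, GPS incline, SRTM incline) in percent.
segments = [(3.0, 4.0, 5.0), (30.0, 4.0, 5.0), (3.0, 40.0, 5.0)]
print([keep_for_evaluation(*s) for s in segments])
```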

Figure 38: Erroneously calculated DTM incline, due to irregularities of the LiDAR DTM.

5.2.2 Accuracy of GPS incline

The comparison of the GPS incline and the DTM incline is realized through the calculation of their difference. It results in the incline error $e_m$, which is given in percent. The following formula was used:

$$e_m = m_{GPS} - m_{DTM},$$

where

$e_m$ = incline error in %

$m_{GPS}$ = incline, derived from GPS traces

$m_{DTM}$ = incline, derived from DTM

⁵⁵ https://de.wikipedia.org/wiki/Baldwin_Street, checked on 20/07/2015

In the following sections, the error is evaluated overall, differentiated by land use classes as well as by different types of terrain such as flat and mountainous. Furthermore, it is investigated whether the number of traces which have been used to derive the GPS incline affects the accuracy.

5.2.2.1 Overall error

Figure 39 shows the street network colorized by the incline error. There are 5 error classes, from smaller than 1 % up to greater than 5 %, shown in a gradient from green (small error) to red (large error). Due to the incomplete coverage of GPS traces, the incline could not be calculated for all streets; those streets are not shown in the map. It can be seen from the map that street segments with a medium or large error are not equally distributed in this area. It is noticeable that in the western part only a few short street segments are colored in yellow or red. As stated in section 4.1, the western part is mainly characterized by flat terrain and farmland. In contrast, more streets with a larger error can be found in the eastern part. This area is part of the “Odenwald”, a mountainous region with mountains up to 600 m and mainly forested areas.

Figure 39: Visualization of the error of GPS incline in the pilot region. (Map: OSM)


In the following, the distribution of the incline error is investigated. The histogram in Figure 40 shows the distribution of the incline error in percent within a range of ± 13 %. The vertical axis represents the relative frequency of error values falling into bins with a width of 0.5 %. Errors with a magnitude greater than 13 % are too few to be visualized and are therefore not shown in the histogram. The error of the incline appears to be normally distributed around the mean of -0.05 %. The standard deviation is σ = 3.97 %, which means that approximately 68 % of the incline errors are within the range of ± σ. When recalculating the standard deviation with only 95 % of the data, it results in σ = 2.31 %, which is roughly half of the value calculated with 100 % of the data. This means that a small number of GPS incline values has been calculated with a large error.
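The effect of a few large errors on the standard deviation can be reproduced with a short sketch. The text does not state which 5 % of the data were left out; the example assumes that the errors with the largest magnitude are dropped and uses the population standard deviation. The function name and the sample values are hypothetical.

```python
import statistics


def std_of_kept_share(errors, keep_fraction=1.0):
    """Population standard deviation of the errors with smallest magnitude.

    Assumption: the excluded share consists of the errors with the
    largest absolute value.
    """
    ordered = sorted(errors, key=abs)
    n_keep = max(2, round(len(ordered) * keep_fraction))
    return statistics.pstdev(ordered[:n_keep])


# Hypothetical incline errors in percent: mostly small, two gross errors.
errors = [-1.0, 1.0] * 19 + [40.0, -40.0]
print(round(std_of_kept_share(errors), 2))        # all data
print(round(std_of_kept_share(errors, 0.95), 2))  # 95 % of the data
```

In this constructed sample, two gross errors out of forty values raise the standard deviation by almost an order of magnitude, which is the same effect as observed for the GPS incline, only more pronounced.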

Figure 40: Histogram of the overall incline error in percent and the bell-curve (red).

[Figure 40 axes: horizontal axis “Incline Error [%]” (-13 to 13); vertical axis “Frequency” (0 % to 18 %)]

Table 7 shows the absolute length in kilometers of the street segments for which the incline was calculated with an error below the given threshold. In addition, it gives the percentage of the total length of street segments for which it was possible to derive incline information (3064 km). Depending on the application which uses incline information, the acceptable error may be different. The higher the acceptable error, the more street segments are available. For almost two-thirds (61 %) of the street network with GPS incline, the GPS incline can be derived with an error smaller than 1 %. This increases to 85 % for an error of up to 3 %, or even to 92 % if an error smaller than 5 % can be accepted.

Incline error $e_m$   Absolute length of street segments   Share of length of street network with GPS incline
< 1 %                 1872 km                              61 %
< 2 %                 2370 km                              77 %
< 3 %                 2607 km                              85 %
< 4 %                 2731 km                              89 %
< 5 %                 2817 km                              92 %

Table 7: The length of street segments for different incline error classes.

5.2.2.2 By Land Use Classes

As in section 5.1.1, the influence of the land use classes on the accuracy of the GPS incline is investigated. The standard deviation has been calculated for each land use class and is shown in Table 8. In addition, the length in kilometers and the percentage of the street segments within the land use classes are given. The land use class ‘forest’ has the highest standard deviation with σ = 5.6 %, whereas for street segments running through farmland and industrial areas the GPS incline could be calculated with the best accuracy (σ = 2.2 % and σ = 2.3 %). The standard deviations of the other land use classes are in the middle, ranging from σ = 3.0 % for ‘allotments’ to σ = 3.8 % in residential areas. The reason may be the different characteristics of the land use classes with regard to the obstruction of the signals from the satellites to the GPS receiver. As in the previous section, the standard deviation decreases for all land use classes when only 95 % of the data are used. This shows that GPS incline values with errors of high magnitude can be found in all land use classes.


Land use class   Std Dev. [%]   Std Dev. 95 % [%]   Length [km]   Length [%]
overall          4.0            2.3                 3064          100
forest           5.4            3.6                 1072          35
residential      3.8            2.1                 844           28
farmland         2.2            1.1                 560           18
grass            3.7            2.0                 187           6
allotments       3.0            1.7                 102           3
commercial       3.1            1.6                 53            2
industrial       2.3            1.7                 48            2

Table 8: The achieved accuracy of GPS incline differentiated by land use classes.

The achieved accuracies of the GPS incline differentiated by land use classes reflect the relative accuracy of the GPS track points evaluated in section 5.1.1.2. Forested areas perform worst, whereas ‘farmland’ is one of the land use classes with the highest relative accuracy. Differences compared to the relative accuracy of the GPS raw data can be found for ‘grass’ and ‘industrial’. For the land use class ‘grass’, the relative accuracy of the GPS track points is the best, whereas the incline could only be determined with medium accuracy. In industrial areas, the opposite case can be observed. For street segments within industrial areas the GPS incline could be calculated with one of the best accuracies, although the relative accuracy of the GPS traces within industrial areas is not as good. The reason may be found in the lack of data. Street segments within the land use classes ‘grass’, ‘allotments’, ‘commercial’ and ‘industrial’ make up only a small part of the street network with calculated GPS incline. This may be due to missing GPS information in these areas or due to fewer streets. The result would be more reliable and robust if more data were available.

5.2.2.3 By Terrain Classes (mountainous / flat)

Besides the influence of the land use class on the achieved accuracy, it is also evaluated whether the terrain affects the accuracy of the GPS incline. A distinction is made between street segments in flat and mountainous areas. A street segment is considered as flat if the DTM incline is smaller than 2 %. Streets with an incline of over 5 % are considered as being in a mountainous area. The results are shown in Table 9. The length of the flat street segments sums up to 2018 km, which represents 66 % of all street segments for which the GPS incline could be calculated. The length of streets in mountainous areas accumulates to 603 km, which corresponds to 20 %. Consequently, the remaining 14 % are street segments with a DTM incline ranging from 2 % to 5 %. These street segments are not considered in this section.


As already stated in sections 5.2.2.1 and 5.2.2.2, the overall standard deviation is σ = 4.0 %. Within flat areas the standard deviation is 2.8 %, whereas in mountainous areas a standard deviation of more than 7 % is calculated. Using only 95 % of the data, it improves to σ = 1.4 % (flat) and σ = 5.7 % (mountainous). Since the standard deviation of the GPS incline values of street segments within mountainous areas is about 2.6 times (all data) to 4 times (95 % of the data) as large as within flat areas, it can be said that the incline within flat areas can be determined with higher accuracy than in mountainous areas. However, it has to be noted that the majority (73 %) of the street segments in mountainous areas run through forested areas, whereas only 19 % of the streets in flat areas fall into forested areas. Thus, the result may also be influenced by the poor accuracy of incline values within forests (cf. section 5.2.2.2).

Terrain Class Std Dev. [%] Std Dev. 95 % [%] length [km] length [%]

overall 4.0 2.3 3064 100

flat 2.8 1.4 2018 66

mountainous 7.2 5.7 603 20

Table 9: The achieved accuracy of GPS incline differentiated by terrain classes.
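
The "95 % of the data" column can be illustrated with a small sketch. The text does not say how the 5 % are excluded, so the rule used here (dropping the errors with the largest absolute value) is an assumption, and the function name is hypothetical.

```python
import math


def trimmed_std(errors, keep_fraction=0.95):
    """Standard deviation of incline errors after discarding the largest outliers.

    Assumption: the (1 - keep_fraction) share of values with the largest
    absolute error is removed before the standard deviation is computed.
    """
    ordered = sorted(errors, key=abs)
    keep = max(1, round(len(ordered) * keep_fraction))
    kept = ordered[:keep]
    mean = sum(kept) / len(kept)
    variance = sum((e - mean) ** 2 for e in kept) / len(kept)
    return math.sqrt(variance)
```

With this rule a few gross outliers, for example from traces with faulty elevation values, no longer dominate the standard deviation.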

5.2.2.4 Effect of Number of GPS Traces on Overall Accuracy

In the previous sections, the investigations have been made considering all street segments for

which the GPS incline could be calculated, regardless of the number of traces which have

been used to calculate the incline. As described in section 0, the GPS incline is calculated from the

elevation differences of the track points of a GPS trace. If a street segment has multiple traces, the

incline is calculated individually for each trace. The incline values of the individual traces are then

aggregated by calculating the average. It is now evaluated if the accuracy increases with the

number of GPS traces per street segment. The diagram in Figure 41 shows the percentage of the

street network with an incline error smaller than 2 % in blue, considering the usage of a certain

number of GPS traces. 100 % is equivalent to the sum of the lengths of all street segments having

at least the number of matched GPS traces. The red line shows the share of the entire street network

including also those street segments for which the GPS incline could not be derived.
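
The two curves of the diagram can be reproduced with a sketch like the following. The data layout and the function name are hypothetical, and the second value follows one reading of the text, namely that the share of the entire network refers to the length with an error below 2 %.

```python
def share_below_threshold(segments, min_traces, total_network_km, max_error=2.0):
    """segments: iterable of (length_km, n_traces, incline_error_percent).

    Returns (share of eligible street length with |error| < max_error,
             share of the entire street network with |error| < max_error),
    both in percent. Eligible means: at least min_traces matched GPS traces.
    """
    eligible = [(length, err) for length, n, err in segments if n >= min_traces]
    eligible_km = sum(length for length, _ in eligible)
    accurate_km = sum(length for length, err in eligible if abs(err) < max_error)
    share_of_eligible = 100.0 * accurate_km / eligible_km if eligible_km else 0.0
    share_of_network = 100.0 * accurate_km / total_network_km
    return share_of_eligible, share_of_network
```

Raising `min_traces` shrinks the eligible length, which is why the first share can rise while the second one falls.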

It was possible to derive the GPS incline with an error smaller than 2 % for 2370 km (77 %) of the

street length considering street segments with at least one GPS trace. This is equivalent to 44 % of

the total street network. When neglecting street segments with fewer than 5 GPS traces, 1128 km of

the street network are covered. Out of these 1128 km, 87 % of the street length has a GPS incline

with an error smaller than 2 %. Compared to the usage of at least 1 trace, this percentage

increases; however, the coverage compared to the total street network decreases significantly to

less than 20 %. This trend continues the more traces are used. When considering street segments

with at least 30 traces, the percentage of street length with an error smaller than 2 % increases to

98 %, although with only 133 km (corresponding to only 2.5 % of the total street network in the

pilot region) there are not many street segments covered with at least 30 GPS traces.

Figure 41: The percentage of streets, with an incline error smaller than 2 % and their share with respect to the

entire street network.

5.2.3 Comparison GPS incline and SRTM incline

The GPS incline derived using the approach presented in this thesis is now compared to the incline derived from the

SRTM-1 DSM, as this is an alternative data source for deriving incline information. As described

previously in section 4.3.4, the DSM is freely available with a horizontal resolution of approximately

1 arcsecond, which is equivalent to approximately 30 m at the equator.
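
The conversion from 1 arcsecond to metres can be checked with a short calculation. This is a sketch under a spherical approximation using the WGS 84 equatorial radius; the latitude of about 49.4° N used for the pilot region around Heidelberg is an added assumption, not a value from the text.

```python
import math

WGS84_EQUATORIAL_RADIUS_M = 6378137.0


def arcsecond_length_m(latitude_deg=0.0):
    """East-west ground length of one arcsecond of longitude (spherical approximation)."""
    one_arcsecond_rad = math.radians(1.0 / 3600.0)
    parallel_radius_m = WGS84_EQUATORIAL_RADIUS_M * math.cos(math.radians(latitude_deg))
    return parallel_radius_m * one_arcsecond_rad


print(round(arcsecond_length_m(0.0), 1))   # equator
print(round(arcsecond_length_m(49.4), 1))  # assumed latitude of the pilot region
```

At the equator the result is close to 31 m. In east-west direction the pixel width shrinks with the cosine of the latitude, whereas the north-south extent stays at roughly 31 m.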

[Figure 41 diagram. x-axis: minimum number of traces (1, 5, 10, 15, 20, 25, 30); y-axis: 0 % to 100 %; series: share of street length with GPS incline error < 2 % [%], share of total street network [%]]

The SRTM incline of the street network was calculated as described in section 4.5 and the standard

deviation has been calculated in the same way as for the GPS incline. This gives a comparable

measure which is suitable to indicate how the accuracy of the GPS incline performs in comparison

to the SRTM incline. Table 10 gives an overview of the share of the street network for which the incline

can be determined with an error smaller than 2 % considering both data sources (GPS and SRTM).

The numbers for the GPS incline are taken from the diagram in section 5.2.2.4.

Data source       Percentage of street network with incline error < 2 %   Coverage
SRTM              73 %                                                    100 %
GPS, ≥ 1 trace    77 %                                                    44 %
GPS, ≥ 5 traces   86 %                                                    18 %

Table 10: Comparison of SRTM and GPS incline in terms of amount of street network with an incline error

smaller than 2 %.

Using SRTM, it is possible to derive the incline with an error smaller than 2 % for 2236 km out of

3064 km (73 %) of the street network. When using GPS traces, this depends on the minimum

number of GPS traces which are used for the determination of incline. Considering street segments

with at least 1 trace, the incline of 77 % of the street network can be determined with an error

smaller than 2 %. When neglecting streets with fewer than 5 GPS traces, the percentage increases to

86 %. However, this requires coverage by enough GPS traces, which is the case for only 18 % of

the entire street network in the pilot region. To summarize, the GPS incline performs

slightly better than the SRTM incline; however, the coverage is more complete with SRTM. It

has to be noted as well that further GPS traces can be collected by volunteers at any time, so the coverage

may increase.

Besides the street length for which the incline was derived within a certain error range, the

standard deviation is compared in the following sections. Firstly, the standard deviations of the

GPS and SRTM incline are differentiated by land use class and secondly by terrain classes.

5.2.3.1 By Land Use Classes

Table 11 shows the comparison of the standard deviations by land use classes. The standard

deviations have been calculated using 95 % of the data. Besides the standard deviation of the

SRTM incline σSRTM, that of the GPS incline σGPS (cf. section 5.2.2.2) and their difference are

shown. Additionally, the standard deviation has been calculated from the error values of the

GPS incline, considering only those street segments which have at least 5 GPS traces (σGPS 5T). The

difference between σSRTM and σGPS 5T is given in the last column. Overall, σSRTM is, at 3.1 %,

0.8 percentage points larger than σGPS and even 1.5 percentage points larger than σGPS 5T. That means that the GPS incline can be

derived with less uncertainty, especially when neglecting street segments with fewer than 5

GPS traces. This holds true within all land use classes. Considering the GPS incline derived from

at least 5 traces, the difference of the standard deviations is larger than 1 percentage point for all land use classes,

reaching almost 2 percentage points in the land use class ‘grass’.

Land use class   σSRTM [%]   σGPS [%]   σGPS - σSRTM [%]   σGPS 5T [%]   σGPS 5T - σSRTM [%]
overall          3.1         2.3        -0.8               1.6           -1.5
forest           4.2         3.6        -0.6               3.1           -1.1
farmland         2.0         1.1        -0.9               0.9           -1.1
residential      2.8         2.1        -0.7               1.5           -1.3
commercial       2.9         1.6        -1.3               1.5           -1.4
allotments       2.6         1.7        -0.9               1.3           -1.3
grass            3.3         2.0        -1.3               1.5           -1.8
industrial       2.6         1.7        -0.9               1.0           -1.6

Table 11: Comparison of the standard deviations of the incline error, overall and differentiated by land use

classes.

The SRTM incline may perform worse for several reasons. Firstly, the SRTM-1 DEM is a

DSM, which means that structures on the earth's surface are not removed from the elevation

information. This would normally not matter, since streets, apart from those in forests or under

bridges, are hardly covered by trees or man-made structures. But due to the low horizontal

resolution of 30 m this becomes a problem. If one square (or pixel) of the DSM covers not only the

street but also buildings and trees right next to the street, the value of this square is an

average elevation of the ground and the other structures. Contrary to SRTM, the GPS track points

are recorded on the earth's surface or, to be more precise, at a constant height above the ground (in

the car or in the backpack). In addition to this problem, the SRTM data has a vertical

accuracy of only 6.2 m, which is a relatively large error in comparison to the relative accuracy of GPS

of 0.6 m within a distance of 30 m (cf. section 5.1.1.2).
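
The mixed-pixel effect described above can be made concrete with a small numeric sketch. It assumes a simple area-weighted average of ground and structure heights, which is a simplification of how a radar-based DSM actually responds; the function names and the example values (a 10 m building covering 30 % of a pixel) are purely illustrative.

```python
def mixed_pixel_elevation(ground_m, structure_height_m, covered_fraction):
    """Pixel elevation as an area-weighted average of ground and structures.

    Assumption: the DSM value is a plain area-weighted mean.
    """
    return ground_m + structure_height_m * covered_fraction


def apparent_incline_percent(elev_a_m, elev_b_m, distance_m=30.0):
    """Incline between two neighbouring pixel centres, in percent."""
    return 100.0 * (elev_b_m - elev_a_m) / distance_m


# A flat street at 100 m: one pixel is free of structures, the neighbouring
# pixel is covered to 30 % by a 10 m high building (illustrative values).
free_pixel = mixed_pixel_elevation(100.0, 0.0, 0.0)
built_pixel = mixed_pixel_elevation(100.0, 10.0, 0.3)
print(apparent_incline_percent(free_pixel, built_pixel))
```

Under these assumptions a perfectly flat street would appear to have an incline of about 10 % between the two pixels, which shows how structures next to the street can distort the SRTM incline.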

5.2.3.2 By Terrain Classes

The standard deviations are now differentiated between flat areas (DTM incline < 2 %) and

mountainous areas (DTM incline > 5 %). As shown in Table 12, the SRTM incline performs better

in flat areas (σSRTM = 2.5 %) than in mountainous areas (σSRTM = 5.1 %). The GPS incline performs

better than the SRTM incline in flat areas, whereas in mountainous areas the SRTM incline is slightly

better. This is surprising since the GPS incline could be derived more accurately overall, within all

land use classes as well as in flat areas. The reason why the SRTM incline is slightly better may not

be that the SRTM incline could actually be determined more accurately, but that the GPS incline

performs extraordinarily badly in mountainous regions.

Terrain       σSRTM [%]   σGPS [%]   σGPS - σSRTM [%]   σGPS 5T [%]   σGPS 5T - σSRTM [%]
overall       3.1         2.3        -0.8               1.6           -1.5
flat          2.5         1.4        -1.1               1.0           -1.5
mountainous   5.1         5.7        0.6                5.4           0.3

Table 12: Comparison of the standard deviations of the incline, overall and differentiated by terrain classes.

5.3 Limitations of Approach

As described in section 0, the approach of deriving incline information from user-generated

GPS traces results in an incline with a reasonable accuracy. However, due to the methodology there

are also limitations. In this section, these limitations will be discussed critically.

In the OpenStreetMap Wiki (2015e), a convention regarding the incline of streets is given. When

adding incline information to OSM-Ways, the street segment shall be split at the beginning and at

the end of the inclined part. The value which is then added to the key ‘incline’ should represent the

maximum value which can be found within this part of street rather than the average incline. But

using the approach of this thesis, the average incline is calculated per street segment. In the

preprocessing, the street segments were split at their intersection points. However, those parts of

the streets in which there is an incline are not detected. Thus, the geometry objects cannot be split

at the beginning and at the end of the inclined part of the street.

Calculating the average incline per street segment without detecting the steepest parts

does not cause a problem as long as the street segment has a constant incline over

its whole length. However, in reality there are situations in which this approach

leads to wrong results. Figure 42 shows two examples, (a) and (b). In (a) the street segment

contains 3 parts with different inclines. Two of them are flat, whereas the one in the middle is

inclined. The average incline, calculated for this street segment, results in a value which is lower

than the incline in reality. This leads to a problem if a person expects, for example, a 5 % incline

along a distance of 100 m and in reality faces a 10 % incline within a distance of 50 m. At

least, in this case it is known that there is an incline. The example in Figure 42(b) shows a situation

in which the average incline results in 0 %, since the street segment contains two inclined parts

with the same magnitude but in opposite directions.
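
Both situations can be reproduced numerically. The following sketch uses a hypothetical representation of a street segment as a list of parts with a horizontal length and an incline; the concrete lengths are illustrative values chosen to match the 5 % versus 10 % example in the text.

```python
def average_and_steepest_incline(parts):
    """parts: list of (horizontal_length_m, incline_percent) along one street segment.

    Returns (average incline over the whole segment, steepest incline by magnitude).
    """
    total_length = sum(length for length, _ in parts)
    total_rise = sum(length * incline / 100.0 for length, incline in parts)
    average = 100.0 * total_rise / total_length
    steepest = max(parts, key=lambda part: abs(part[1]))[1]
    return average, steepest


# Situation (a): flat, inclined, flat
print(average_and_steepest_incline([(25.0, 0.0), (50.0, 10.0), (25.0, 0.0)]))
# Situation (b): uphill followed by downhill of the same magnitude
print(average_and_steepest_incline([(50.0, 10.0), (50.0, -10.0)]))
```

In (a) the average is 5 % although a 10 % part has to be overcome, and in (b) the average is 0 % although both parts have an incline of 10 % in magnitude.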

Figure 42: Situations where the calculated incline differs from the steepest incline.56

This problem was not addressed in this thesis as the main focus was on the examination of user-

generated GPS traces with regard to their feasibility of deriving incline information as well as the

development of a method and tools which handle the GPS data and process them to derive incline

information.

56 Bicycle pictogram: © Pixabay user ‘ClkerFreeVectorImages’, source of image: https://pixabay.com/de/fahrrad-piktogramm-sport-307977/, checked on 20/07/2015

6 Conclusion and Outlook

6.1 Conclusion

Different user groups may benefit from route planning which considers the incline of a street

network. There are, for example, mobility-restricted people, such as wheelchair users, people with

walking aids or even parents with pushchairs, for whom streets or paths may be inaccessible if

there is an incline of a certain magnitude. Knowing the incline in advance, a route can be planned

that avoids steep streets. The chosen route may be longer, but not as steep as the shortest one.

Furthermore, it is useful for route planning of electricity-powered vehicles or bicycles.

The data of the OpenStreetMap project, which is a freely available source of street network data

and often used by routing engines, provides incline information for only 0.2 % of the street

network. Therefore, the automatic derivation of incline values may fill the gap. One source of

elevation information to derive incline information for a street network may be digital elevation

models (DEMs). There are DEMs acquired from LiDAR measurements. These are very accurate;

however, they are usually also very expensive and not globally available. Alternatively, low-cost

DEMs like SRTM-1 DEM or ASTER are freely and (almost) globally available but are limited

by their horizontal resolution of 30 m and vertical accuracy of 9 m (ASTER GDEM) and 6 m

(SRTM-1 DEM). Another source of elevation data, which is freely available and at least theoretically

globally available, are user-generated GPS traces of the OpenStreetMap project. Initially

collected for the purpose of map making, the data might also serve other purposes. Contrary to

SRTM-1 and ASTER and depending on the coverage, many GPS track points may fall within a

square of 30 m, which is the horizontal resolution of SRTM-1 DEM and ASTER GDEM. Therefore,

there is more information about the elevation which potentially results in incline values of

higher accuracy, although the absolute vertical accuracy of GPS is known to be fairly poor. But

rather than the absolute elevation, only the difference in elevation of two adjacent points is of

relevance. The relative accuracy is assumed to be better than the absolute one.

The following aims for this thesis have been formulated:

- Creation and implementation of a workflow to calculate the incline of streets, using user-contributed GPS traces.
- Assessment of the quality of voluntarily collected GPS traces in terms of
  o vertical accuracy (absolute and relative)
  o coverage of GPS traces
- Assessment of the achieved quality of the incline information, compared to LiDAR and SRTM-1 DEM.
- Publication of developed software as Open Source and provision to the OpenStreetMap community

The steps to fulfill the aforementioned aims will be discussed in the following.

Before calculating the GPS incline for the segments of the street network, different steps have to be

undertaken to prepare the two main input data sets, the GPS traces and the street network. The GPS

traces are downloaded from the OpenStreetMap (OSM) project and include over 4000 traces in the

pilot region (Heidelberg Area / Germany). Not all of the traces have the optional elevation

information; therefore, only 3842 traces with over 2 million GPS track points remain to derive the

incline. To import the GPS traces, which can be downloaded as a compressed file archive (*.tar.xz), a

Java tool has been developed. It reads the file archive, filters the GPS traces by bounding box,

rejects all traces without elevation information and stores the traces in the database. Since the

elevation information of the GPS traces suffers from noise and other irregularities, it has to be

preprocessed in the following step. The street network, which is going to be enhanced with incline

information, has also been taken from OSM. The data has been imported to the database using the

Java-tool Osmosis. For the pilot region the street network has a total length of 5338 km, containing

different types of streets. It includes for example residential streets (18 %) and motorways (2 %)

but also paths which are exclusively dedicated to pedestrians or cyclists (together 20 %). Like the

GPS traces, the street network has also been preprocessed. The streets have been split at their

intersection points with other streets to avoid long streets which may span several valleys and hills.

It is assumed that the GPS traces were recorded while traveling on a street, which is important

for the next step. For the incline calculation, the assignment of the GPS traces to the street

segments (map matching) is an essential step. The assumption has been made that streets which

are parallel to each other (e.g. street with two separate lanes, footpath next to street) also have the

same incline. Consequently, GPS traces which were recorded on one of the parallel streets can also

be used for the incline calculation of the other one. This increases the number of traces per street;

however, this assumption may also lead to errors if two parallel streets have different inclines.
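
The filtering performed during the import of the GPS traces (bounding box and presence of elevation) can be sketched as follows. This is not the Java tool developed in the thesis; the data structure, the function name and the bounding box values are hypothetical, and treating a trace as usable only if every remaining track point has an elevation value is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrackPoint:
    lat: float
    lon: float
    ele: Optional[float]  # elevation is optional in GPS traces


def filter_traces(traces: List[List[TrackPoint]],
                  bbox: Tuple[float, float, float, float]) -> List[List[TrackPoint]]:
    """Keep the parts of traces inside bbox = (min_lat, min_lon, max_lat, max_lon).

    Assumption: a trace is rejected if any remaining point lacks an elevation.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    kept = []
    for trace in traces:
        inside = [p for p in trace
                  if min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon]
        if inside and all(p.ele is not None for p in inside):
            kept.append(inside)
    return kept


# Hypothetical bounding box roughly around Heidelberg
PILOT_REGION_BBOX = (49.3, 8.5, 49.5, 8.9)
```

Filtering before the data is stored keeps the database limited to traces that can actually contribute to the incline calculation.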

The GPS incline calculation was done for each street segment individually. First of all, the

previously assigned GPS traces of the street are selected. A buffer of the street segments with a size

of 15 m is then used to clip the selected traces. Only those parts of the traces which fall into the

buffer shall be used to calculate the incline. After that, the incline for each GPS trace is calculated

by averaging all inclines derived from every two adjacent GPS track points. If there are multiple

traces per street segment, the procedure is repeated for the other ones as well. At the end, the

incline values of all traces are averaged to the final incline of the street segment.
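
The calculation steps described above can be sketched as follows. This is a simplified illustration, not the implementation of the thesis: the function names are hypothetical, the traces are assumed to be already clipped to the 15 m buffer, the horizontal distance is approximated with the haversine formula on a sphere, and all traces are assumed to be oriented in the direction of the street segment (otherwise uphill and downhill recordings would cancel out).

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance between two points in metres (spherical earth)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def trace_incline_percent(points):
    """points: list of (lat, lon, ele_m) of one clipped trace, in segment direction.

    Averages the inclines of all pairs of adjacent track points.
    """
    inclines = []
    for (lat1, lon1, ele1), (lat2, lon2, ele2) in zip(points, points[1:]):
        distance = haversine_m(lat1, lon1, lat2, lon2)
        if distance > 0:  # skip duplicate positions
            inclines.append(100.0 * (ele2 - ele1) / distance)
    return sum(inclines) / len(inclines) if inclines else None


def segment_incline_percent(traces):
    """Average of the per-trace inclines of all traces assigned to a street segment."""
    values = [v for v in (trace_incline_percent(t) for t in traces) if v is not None]
    return sum(values) / len(values) if values else None
```

Because only elevation differences between adjacent points enter the calculation, a constant elevation offset of a whole trace does not affect the result.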

The second aim of this thesis is the evaluation of the GPS raw data with regard to the absolute and

relative accuracy considering a LiDAR DTM as reference. Overall, the RMSE (using 90 % of the

data) of the GPS-elevation is 27 m and depending on the land use class it ranges from 21 m for

GPS track points in farmland to 35 m in the land use class ‘allotments’. This is worse than stated

in the literature in which the accuracy of low-cost GPS receivers has been assessed. It may happen

that smartphone apps use elevation databases which rely on DEMs, such as SRTM-1. Furthermore,

some handheld devices have a barometric measurement unit. In addition, the GPS elevation refers

in many cases to the mean sea level, although it should be given as the height over the WGS 84

ellipsoid. Only some GPS track points are referred to the ellipsoid. This means that many

smartphone applications or handheld GPS devices internally transform the ellipsoidal to geoidal

height with the help of a geoid model.

To judge the relative accuracy, the RMSE of the elevation differences between two adjacent GPS

track points has been calculated. Overall, the RMSE is 0.3 m; however, depending on the land use

class the RMSE ranges from 0.2 m to 0.5 m. Land use classes which are characterized mainly by

fields and almost no buildings, like ‘farmland’ or ‘allotments’, perform better, with an RMSE of 0.2 m,

than others which are characterized by tall buildings or a dense tree canopy, such as residential

areas or forests (RMSE = 0.3 m / 0.5 m). Combining the RMSE with the average distance

between the points results in an incline error of 2.4 % overall, 1.0 % for farmland and 4.3 % for

forested areas. With an incline accuracy of 2.4 % it is possible to derive incline out of GPS traces

with a reasonable accuracy, especially considering that streets are often covered by multiple traces.
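
How an RMSE of elevation differences translates into an incline error can be sketched with a one-line formula. The function name is hypothetical, and the assumption is that the incline error is simply the RMSE divided by the average point distance.

```python
def incline_error_percent(rmse_elevation_difference_m, average_point_distance_m):
    """Incline error implied by an elevation-difference RMSE over a given point distance."""
    return 100.0 * rmse_elevation_difference_m / average_point_distance_m


# Rounded overall values from this chapter: RMSE = 0.3 m, point distance = 14 m
print(round(incline_error_percent(0.3, 14.0), 1))
```

With the rounded overall values this gives roughly 2.1 %, which is close to but not identical with the reported 2.4 %; presumably unrounded values were used for the reported figures.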

Besides the absolute and relative accuracy, the coverage of GPS traces and density of GPS track

points has been evaluated. The coverage was investigated by street type. When considering all

street types which are used by cars, it can be said that in the pilot region streets of higher priority

also have a higher coverage. With at least one GPS trace, 100 % of the motorways, primary and

secondary streets are covered, while only 60 % of the residential streets are covered. Considering the

coverage of at least 5 GPS traces, still almost 100 % of the motorways and primary streets, 82 % of

the secondary and only 14 % of the residential streets are covered. The street types ‘path’ and

‘footway’, which are used by pedestrians and also mobility-restricted people, have a coverage comparable

to that of residential streets. Cycle ways even have a better coverage, which can be compared to that of

secondary streets. This shows that the contributors of OSM are not only traveling by car; there are

also many paths covered which are dedicated to pedestrians. This is of benefit, considering the

motivation of this thesis of calculating the incline for mobility-restricted people.

The overall average distance between two adjacent points of the GPS traces is 14 m. Compared to the

horizontal resolution of SRTM-1 DEM, this is about twice as dense, and 2 GPS track points theoretically

fall into one pixel of the DEM. This results in a higher horizontal resolution, especially if a street is

covered by multiple traces. The distance between two GPS track points depends on the average speed on

the street type. For example, the average distance on motorways is 36 m, while on footways the

average distance is only 9 m. This implies that most devices record the GPS track points at a

time-dependent interval.

The third aim of this thesis is to validate the result, using incline derived from the high-accuracy

DTM as reference and the SRTM-1 DSM. The incline was calculated for 3064 km street length

which is equivalent to 57 % of the entire street network within the pilot region. Out of this street

length, 61 % have an incline error smaller than 1 %, which is probably a sufficient accuracy for

most use-cases. For as much as 85 % (2607 km) of this street length, the incline could be calculated

with an error below 3 %, which may still be reasonable for some use-cases. The normally distributed

incline error has a standard deviation of σ = 2.3 % (with 95 % of the data), but depending on the

land use class σ ranges from σ = 1.1 % (farmland) to σ = 3.6 % (forest). It is noticeable that

the incline is more accurate within land use classes which do not suffer from obstructions of the

satellite signal. When differentiating the incline accuracy by terrain classes, it was discovered that the

incline can be determined with higher accuracy in flat areas (σ = 1.4 %), whereas in mountainous

areas the accuracy is worse with σ = 5.7 %. However, the main part of the mountainous area is

characterized by forests, which also perform worse than other land use classes.

The accuracy can generally be improved by only considering street segments which are covered

by multiple traces. For example, if all street segments with a GPS incline are considered, 77 % of

the inclines are determined with an error smaller than 2 %. With an increasing minimum number

of traces, the percentage of streets with an incline error below 2 % increases to 87 % (≥ 5 traces)

and 92 % (≥ 10 traces). However, the coverage gets significantly worse.

The GPS incline was compared to the incline derived from the SRTM-1 DSM to see how user-

generated GPS traces perform in comparison to other freely available data. The evaluation has

shown that the GPS incline performs slightly better than the SRTM incline. Using SRTM-1, the

incline could be determined with an error smaller than 2 % for 73 % of the street network. With

GPS traces this increases to 77 %, considering all street segments with at least 1 GPS trace and to

86 % for street segments with at least 5 GPS traces. However, the coverage of GPS cannot keep up with

the SRTM-1 DEM, which is almost globally available. The percentage of the entire street network with a GPS

incline error smaller than 2 % is only 44 % (≥ 1 GPS trace) and 18 % (≥ 5 GPS traces), respectively.

To conclude, it is possible to derive incline values of a street network with a

reasonable accuracy if the streets are covered with multiple GPS traces. Especially in comparison

to SRTM-1 DSM, the GPS incline performs better, although the coverage is significantly lower.

By introducing other sources of user-generated GPS traces, the coverage can be improved. The

result shows that it is nowadays possible to achieve comparable or even slightly better results

with user-generated data, compared to data collected by a research satellite. However, user-

generated GPS traces also require satellites, but the data was not primarily collected for the purpose

of incline calculation.

6.2 Outlook

This approach has been tested in the area around Heidelberg. The advantage of this area is that

there is a diversity of land use classes as well as flat and mountainous areas. However, the

mountainous area is mainly covered by forests and the residential areas are often in flat terrain. For

further tests regarding the dependency of the accuracy of GPS incline on mountainous and flat terrain,

the approach could be applied to other pilot regions, for example Zürich in Switzerland,

where a high-accuracy DTM is available as open data.

Furthermore, future work could try to improve the results of this approach. One way to

achieve better results is the introduction of other sources on top of OpenStreetMap. The data of

sport tracking platforms and projects such as Strava or gpsies.com could be combined with the

GPS traces from OSM. Unfortunately, some projects do not offer public access to the GPS traces,

but by requesting the data for a specific reason or setting up a cooperation, it might be possible to

get anonymized data. The larger amount of data would result in a higher coverage of GPS traces,

which leads to a higher completeness of streets with GPS incline and to higher accuracies, since

there will be more streets with multiple corresponding traces. If there is a limited and relatively

small area of interest, it is also possible to have volunteers collect data specifically for this purpose.

Furthermore, a smartphone can be given to people who regularly drive or walk through the area

(e.g. couriers, taxi drivers) to record their location during the entire work day.

An additional field of further research would be to compute a digital terrain model out of user-

generated GPS traces. Massad & Dalyot (2015) already did investigations towards the generation

of a DTM, using GPS traces recorded from a smartphone. They tested their approach, which

includes a 2D Kalman filtering, on a university campus with data collected exclusively for this

purpose and under good conditions. The measured GPS track points are relatively evenly

distributed, since not only points on paths have been measured, but also on lawns and parking

spaces. The approach of Massad & Dalyot (2015) could be tested using the GPS data of OSM. It

offers a large amount of data; however, it also involves some challenges. In the case of OSM

GPS traces, it is not known which devices were used or to which vertical datum the elevation

refers, and the traces are mainly recorded on streets or paths. The latter would lead to a gap of

data in the areas between the streets. Furthermore, streets often have multiple traces, which leads to a

high density of points which may differ a lot in elevation due to their poor absolute accuracy.

Because of the aforementioned challenges it would be interesting to find out if the OSM GPS traces

are suitable for deriving a digital terrain model.

7 Bibliography

Bachofer, F. (2011): Einfluss der vertikalen Genauigkeit von DGM auf das EcoRouting von

Elektrofahrzeugen. In J. Strobl, T. Blaschke, G. Griesebner (Eds.): Angewandte Geoinformatik

2011. Beiträge zum 23. AGIT-Symposium Salzburg. Berlin, Offenbach: Wichmann,

pp. 338–346.

Bauer, C. A. (2010): User Generated Content – Urheberrechtliche Zulässigkeit nutzergenerierter

Medieninhalte. In H. Große Ruse-Khan, N. Klass, S. von Lewinski (Eds.): Nutzergenerierte

Inhalte als Gegenstand des Privatrechts, vol. 15. Berlin, Heidelberg: Springer, pp. 1–42.

Boucher, C. (2013): Fusion of GPS, OSM and DEM Data for Estimating Road Network Elevation.

In: Fifth International Conference on Computational Intelligence, Communication Systems

and Networks (CICSyN). Madrid, Spain, pp. 273–278.

Cartwright, W.; Gartner, G.; Meng, L.; Peterson, M. P.; Peckham, R. J.; Jordan, G. (2007): Digital

Terrain Modelling. Berlin, Heidelberg: Springer.

Conley, R.; Cosentino, R.; Hegarty, C. J.; Kaplan, E. D.; Leva, J. L.; Uijt de Haag, M.; Van Dyke,

K. (2006): Performance of Stand-Alone GPS. In E. D. Kaplan, C. Hegarty (Eds.): Understanding

GPS. Principles and applications. 2nd edition. Boston: Artech House, pp. 301–378.

Cosentino, R. J.; Diggle, D. W.; Uijt de Haag, M.; Hegarty, C. J.; Milbert, D.; Nagle, J. (2006):

Differential GPS. In E. D. Kaplan, C. Hegarty (Eds.): Understanding GPS. Principles and

applications. 2nd edition. Boston: Artech House, pp. 379–458.

Czegka, W.; Braune, S.; Behrends, K. (2004): Die Qualität der SRTM-90m Höhendaten und ihre

Verwendbarkeit in GIS. 24. Wissenschaftlich-Technische Tagung der DGPF. Halle, 2004.

Ding, D.; Parmanto, B.; Karimi, H. A.; Roongpiboonsopit, D.; Pramana, G.; Conahan, T.;

Kasemsuppakorn, P. (2007): Design considerations for a personalized wheelchair navigation

system. In Conference proceedings: Annual International Conference of the IEEE Engineering

in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society.

Annual Conference 2007, pp. 4790–4793.

empirica Gesellschaft für Kommunikations- und Technologieforschung mbH (2015): Welcome to

cap4access. Available online at http://cap4access.eu/intro/, checked on 1/15/2015.

European Space Agency (2015): What is Galileo? Available online at

http://www.esa.int/Our_Activities/Navigation/The_future_-_Galileo/What_is_Galileo,

checked on 4/23/2015.

Farr, T. G.; Rosen, P. A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S. et al. (2007): The Shuttle

Radar Topography Mission. In Reviews of Geophysics 45 (2). DOI:

10.1029/2005RG000183.

Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P. (1996): From Data Mining to Knowledge Discovery

in Databases. In U. M. Fayyad (Ed.): Advances in knowledge discovery and data mining.

Menlo Park: AAAI Press, pp. 1–34.

Feairheller, S.; Clark, R. (2006): Other Satellite Navigation Systems. In E. D. Kaplan, C. Hegarty

(Eds.): Understanding GPS. Principles and applications. 2nd edition. Boston: Artech House,

pp. 595–634.

Franke, D.; Dzafic, D.; Baumeister, D.; Kowalewski, S. (2012): Energieeffizientes Routing für

Elektrorollstühle. In: 13. Aachener Kolloquium Mobilität und Stadt (AMUS/ACMOTE):

RWTH Aachen, pp. 65–68. Available online at

http://publications.embedded.rwth-aachen.de/file/51, checked on 7/21/2015.

Goodchild, M. F. (2007): Citizens as sensors: the world of volunteered geography. In GeoJournal

69 (4), pp. 211–221. DOI: 10.1007/s10708-007-9111-y.

Hahmann, S. (2014): Zur Beziehung von Raum und Inhalt nutzergenerierter geographischer

Informationen. Dissertation. Technische Universität Dresden, Dresden. Institut für

Kartographie.

Haining, R. P. (2003): Spatial data analysis. Theory and practice. Cambridge, UK, New York:

Cambridge University Press.

Haklay, M.; Weber, P. (2008): OpenStreetMap. User-Generated Street Maps. In IEEE Pervasive

Computing 7 (4), pp. 12–18. DOI: 10.1109/MPRV.2008.80.

Han, J.; Kamber, M. (2006): Data mining. Concepts and techniques. 2nd ed. Amsterdam, Boston,

San Francisco, CA: Elsevier; Morgan Kaufmann (The Morgan Kaufmann series in data

management systems).

Han, S.; Rizos, C. (1999): Road Slope Information from GPS-Derived Trajectory Data. In Journal

of Surveying Engineering 125 (2), pp. 59–68.

Harriehausen-Mühlbauer, B. (2014): Mobile Navigation for Limited Mobility Users. In D.

Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, A. Kobsa, F. Mattern et al. (Eds.): Digital

Human Modeling. Applications in Health, Safety, Ergonomics and Risk Management, vol.

8529. Cham: Springer International Publishing (Lecture Notes in Computer Science),

pp. 535–545.

Heipke, C. (2010): Crowdsourcing geospatial data. In ISPRS Journal of Photogrammetry and

Remote Sensing 65 (6), pp. 550–557. DOI: 10.1016/j.isprsjprs.2010.06.005.

Hofmann-Wellenhof, B.; Lichtenegger, H.; Wasle, E. (2008): GNSS - Global Navigation Satellite

Systems. GPS, GLONASS, Galileo, and more. Wien, New York: Springer.

Jokar Arsanjani, J. (2014): Case study I: VGI platforms and data generalization. In D. Burghardt,

C. Duchêne, W. Mackaness (Eds.): Abstracting Geographic Information in a Data Rich

World. Cham: Springer International Publishing (Lecture Notes in Geoinformation and

Cartography), pp. 131–138.

Karussel (2014): Digitalizing GPX Points or How to Track Vehicles With GraphHopper. Available

online at

https://karussell.wordpress.com/2014/07/28/digitalizing-gpx-points-or-how-to-track-vehicles-with-graphhopper/, updated on 7/28/2014, checked on 5/8/2015.

Kono, T.; Fushiki, T.; Asada, K.; Nakano, K. (2008): Fuel Consumption Analysis and Prediction

Model for “Eco” Route Search. In: 15th World Congress on Intelligent Transport Systems

and ITS America's 2008 Annual Meeting.

Kurihara, M.; Nonaka, H.; Yoshikawa, T. (2004): Use of highly accurate GPS in network-based

barrier free street map creation system. In: IEEE International Conference on Systems, Man

and Cybernetics. The Hague, Netherlands, Oct. 10-13, 2004, pp. 1169–1173.

Langley, R. B. (1999): Dilution of Precision. In GPS World 10 (5), pp. 52–59.

Liu, G.; Hossain, K. M. A.; Iwai, M.; Ito, M.; Tobe, Y.; Sezaki, K.; Matekenya, D. (2014): Beyond

horizontal location context: Measuring Elevation Using Smartphone’s Barometer. In:

Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous

Computing: Adjunct Publication. New York, USA, pp. 459–468.

Marchal, F.; Hackney, J.; Axhausen, K. (2005): Efficient Map Matching of Large Global

Positioning System Data Sets: Tests on Speed-Monitoring Experiment in Zürich. In Transportation

Research Record 1935 (1), pp. 93–100. DOI: 10.3141/1935-11.

Massad, I.; Dalyot, S. (2015): Towards the production of digital terrain models from volunteered

GPS trajectories. In Survey Review. DOI: 10.1179/1752270615Y.0000000010.

Menkens, C.; Sussmann, J.; Al-Ali, M.; Breitsameter, E.; Frtunik, J.; Nendel, T.; Schneiderbauer,

T. (2011): EasyWheel - A Mobile Social Navigation and Support System for Wheelchair

Users. In: Eighth International Conference on Information Technology: New Generations

(ITNG). Las Vegas, NV, USA, pp. 859–866.

Mennis, J.; Guo, D. (2009): Spatial data mining and geographic knowledge discovery—An

introduction. In Computers, Environment and Urban Systems 33 (6), pp. 403–408. DOI:

10.1016/j.compenvurbsys.2009.11.001.

Meyer, D. J. (2011): ASTER Global Digital Elevation Model Version 2 – Summary of Validation

Results. Available online at

https://www.jspacesystems.or.jp/ersdac/GDEM/ver2Validation/Summary_GDEM2_validation_report_final.pdf,

checked on 1/15/2015.

Müller, A.; Neis, P.; Auer, M.; Zipf, A. (2010): Ein Routenplaner für Rollstuhlfahrer auf der Basis

von OpenStreetMap-Daten - Konzeption, Realisierung und Perspektiven. In J. Strobl, T.

Blaschke, G. Griesebner (Eds.): Angewandte Geoinformatik 2010. Beiträge zum 22. AGIT-

Symposium Salzburg. Berlin, Offenbach: Wichmann.

Neis, P.; Zielstra, D. (2014a): Generation of a tailored routing network for disabled people based

on collaboratively collected geodata. In Applied Geography 47, pp. 70–77. DOI:

10.1016/j.apgeog.2013.12.004.

Neis, P.; Zielstra, D. (2014b): Recent Developments and Future Trends in Volunteered Geographic

Information Research. The Case of OpenStreetMap. In Future Internet 6 (1), pp. 76–106.

DOI: 10.3390/fi6010076.

Neis, P.; Zielstra, D.; Zipf, A. (2012): The Street Network Evolution of Crowdsourced Maps.

OpenStreetMap in Germany 2007–2011. In Future Internet 4 (1), pp. 1–21. DOI:

10.3390/fi4010001.

Open Knowledge Foundation (2015): ODC Open Database License (ODbL) Summary. Available

online at http://opendatacommons.org/licenses/odbl/summary/, checked on 4/20/2015.

OpenStreetMap Foundation Wiki (2015a): About. Available online at

http://wiki.osmfoundation.org/w/index.php?title=About&oldid=3201, updated on 4/1/2015,

checked on 4/20/2015.

OpenStreetMap Foundation Wiki (2015b): License/We Are Changing The License. Available

online at

http://wiki.osmfoundation.org/w/index.php?title=License/We_Are_Changing_The_License&oldid=1813,

updated on 4/1/2015, checked on 4/20/2015.

OpenStreetMap Foundation Wiki (2015c): Working Groups. Available online at

http://wiki.osmfoundation.org/w/index.php?title=Working_Groups&oldid=2220, updated on

4/1/2015, checked on 4/20/2015.

OpenStreetMap Wiki (2015a): Bing. Available online at

http://wiki.openstreetmap.org/w/index.php?title=Bing&oldid=1117458, updated on

4/16/2015, checked on 4/18/2015.

OpenStreetMap Wiki (2015b): Map Features. Available online at

http://wiki.openstreetmap.org/w/index.php?title=Map_Features&oldid=1178564, checked on

5/27/2015.

OpenStreetMap Wiki (2015c): Stats. Available online at

http://wiki.openstreetmap.org/w/index.php?title=Stats&oldid=1145799, updated on

4/9/2015, checked on 4/15/2015.

OpenStreetMap Wiki (2015d): User:Ikonor/DE:SRTM Alternativen / DGM – OpenStreetMap

Wiki. Edited by OpenStreetMap Wiki. Available online at

http://wiki.openstreetmap.org/w/index.php?title=User:Ikonor/DE:SRTM_Alternativen_/_DGM&oldid=1160583,

checked on 7/11/2015.

OpenStreetMap Wiki (2015e): Key:incline. Available online at

http://wiki.openstreetmap.org/w/index.php?title=Key:incline&oldid=1148320, checked on

5/27/2015.

Quddus, M. A.; Ochieng, W. Y.; Noland, R. B. (2007): Current map-matching algorithms for

transport applications: State-of-the art and future research directions. In Transportation

Research Part C: Emerging Technologies 15 (5), pp. 312–328. DOI: 10.1016/j.trc.2007.05.002.

Ramm, F.; Topf, J. (2010): OpenStreetMap. Die freie Weltkarte nutzen und mitgestalten.

3. Auflage. Berlin: Lehmanns Media.

Ramm, F.; Topf, J.; Chilton, S. (2011): OpenStreetMap. Using and enhancing the free map of the

world. English ed. Cambridge, England: UIT Cambridge.

Resch, B. (2013): People as Sensors and Collective Sensing - Contextual Observations

Complementing Geo-Sensor Network Measurements. In J. M. Krisp (Ed.): Progress in

location-based services. Heidelberg, New York: Springer (Lecture Notes in Geoinformation and

Cartography), pp. 391–406.

Sachenbacher, M.; Leucker, M.; Artmeier, A.; Haselmayr, J. (2011): Efficient Energy-Optimal

Routing for Electric Vehicles. In: Proceedings of the Twenty-Fifth AAAI Conference on

Artificial Intelligence and the Twenty-Third Innovative Applications of Artificial

Intelligence Conference, 7-11 August 2011, San Francisco, California, USA. Menlo Park, Calif.:

AAAI Press, pp. 1402–1407.

Santerre, R.; Pan, L.; Cai, C.; Zhu, J. (2014): Single Point Positioning Using GPS, GLONASS and

BeiDou Satellites. In Positioning 05 (04), pp. 107–114. DOI: 10.4236/pos.2014.54013.

Sester, M.; Jokar Arsanjani, J.; Klammer, R.; Burghardt, D.; Haunert, J.-H. (2014): Integrating and

Generalising Volunteered Geographic Information. In D. Burghardt, C. Duchêne, W.

Mackaness (Eds.): Abstracting Geographic Information in a Data Rich World. Cham:

Springer International Publishing (Lecture Notes in Geoinformation and Cartography),

pp. 119–155.

Shekhar, S.; Zhang, P.; Huang, Y.; Vatsavai, R. R. (2004): Trends in Spatial Data Mining. In H.

Kargupta (Ed.): Data mining. Next generation challenges and future directions. Menlo Park,

Calif., London, Cambridge, Mass.: AAAI Press; Copublished and distributed by MIT Press.

Sui, D. Z. (2008): The wikification of GIS and its consequences: Or Angelina Jolie’s new tattoo

and the future of GIS. In Computers, Environment and Urban Systems 32 (1), pp. 1–5. DOI:

10.1016/j.compenvurbsys.2007.12.001.

Torge, W. (2001): Geodesy. 3rd completely rev. and extended ed. Berlin, New York: W. de

Gruyter.

Tukey, J. W. (1977): Exploratory data analysis. Reading, Mass.: Addison-Wesley Pub. Co

(Addison-Wesley series in behavioral science).

van Winden, K. (2014): Automatically Deriving and Updating Attribute Road Data from

Movement Trajectories. Master's Thesis. Delft University of Technology.

Völkel, T.; Weber, G. (2008): RouteCheckr. In S. Harper, A. Barreto (Eds.): the 10th international

ACM SIGACCESS conference. Halifax, Nova Scotia, Canada, p. 185.

Zhang, L.; Thiemann, F.; Sester, M. (2010): Integration of GPS traces with road map. In:

Computational Transportation Science, pp. 17–22.

Zhilin, L.; Qing, Z.; Gold, C. (2005): Digital terrain modeling. Principles and methodology. New

York, USA: CRC-Press.