in cooperation with:
GIScience Group
Institute of Geography
Faculty of Chemistry and Earth Sciences
Methods of Geoinformation Science
Institute of Geodesy and Geoinformation Science
Faculty VI Planning Building Environment
MASTER’S THESIS
Deriving incline for street networks from
voluntarily collected GPS traces
Submitted by: Steffen John
Matriculation number: 343372
Supervisors: Prof. Dr.-Ing. Marc-O. Löwner (TU Berlin)
Dr.-Ing. Stefan Hahmann (Universität Heidelberg)
Submission date: 24.07.2015
Declaration of Authorship
I, Steffen John, declare that this thesis titled ‘Deriving incline for street networks from voluntarily
collected GPS traces’ and the work presented in it are my own. I confirm that:
- This work was done wholly or mainly while in candidature for a research degree at this University.
- Where any part of this thesis has previously been submitted for a degree or any other qualification
at this University or any other institution, this has been clearly stated.
- Where I have consulted the published work of others, this is always clearly attributed.
- Where I have quoted from the work of others, the source is always given. With the exception
of such quotations, this thesis is entirely my own work.
- I have acknowledged all main sources of help.
- Where the thesis is based on work done by myself jointly with others, I have made clear exactly
what was done by others and what I have contributed myself.
Signed:
Date:
Abstract
The knowledge of incline is useful for many use-cases in navigation for electricity-powered vehicles,
cyclists or mobility-restricted people (e.g. wheelchair users). Digital elevation models (DEMs), such as
DEMs obtained from laser scanning or the SRTM DEM, are either too expensive, not globally available or
not accurate enough. Therefore, GPS traces collected voluntarily by users of the OpenStreetMap
project have been used to derive the incline of a street network. Owing to the high relative accuracy of
the GPS traces, the incline could be computed with an accuracy that is reasonable for many use-cases.
The comparison with the SRTM DEM has shown that the inclines calculated from GPS traces perform
slightly better, with a standard deviation of σGPS = 1.6 % (σSRTM = 3.1 %), considering streets with at
least 5 GPS traces. In contrast to SRTM, which offers full coverage, the incline could only be derived
for 18 % of the street network (> 5 traces).
Kurzfassung (Abstract in German Language, translated)
Incline information adds value to many routing applications, for example the routing of electrically
powered vehicles, cyclists or people with mobility restrictions (e.g. wheelchair users). Digital terrain
models (DTMs), such as DTMs created by laser scanning or the SRTM-1 DTM, are either too expensive,
not available area-wide, or insufficient in resolution and accuracy. Therefore, user-generated GPS
trajectories are to be used to calculate the incline of streets. Owing to the high relative accuracy
found for the trajectories, it was possible to calculate the incline with an accuracy that is sufficient
for many applications. The comparison with the SRTM DTM has shown that the inclines derived from
GPS data are better, with a standard deviation of σGPS = 1.6 % (σSRTM = 3.1 %). Only streets with at
least 5 GPS trajectories were used to determine the standard deviation. In contrast to SRTM, the
inclines could not be determined for all streets, but only for 18 % of them (with more than
5 trajectories).
Table of Contents
1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Outline
2 Background
2.1 Global Navigation Satellite Systems
2.1.1 GPS Setup and Determination of Location
2.1.2 Error Sources
2.1.3 GLONASS, Galileo and Beidou
2.2 Volunteered Geographic Information
2.2.1 Terminology and Nature of VGI
2.2.2 Classification and Examples
2.3 OpenStreetMap
2.3.1 Introduction to Project
2.3.2 Data Model
2.3.3 Incline Information in OpenStreetMap
2.4 Data Mining
3 Related Work
3.1 3D Routing
3.1.1 Wheelchair routing
3.1.2 Energy-efficient routing
3.2 Extraction of Street Attributes from user-generated Movement Trajectories
3.3 Derivation of 3D information, using high-accurate GPS measurements
3.4 Map Matching
3.4.1 Categorization of Map Matching Algorithms
3.4.2 Functionality of Selected Algorithms
3.5 Smoothing of Time Series Measurements
4 Methodology
4.1 Definition of Pilot Region
4.2 Tools
4.3 Data
4.3.1 Crowdsourced GPS traces
4.3.1.1 Platforms and Devices
4.3.1.2 The GPX Format
4.3.1.3 OpenStreetMap GPS traces
4.3.1.4 Typical Errors
4.3.2 Street Network
4.3.3 Land Use Information
4.3.4 Digital Elevation Models
4.4 Workflow and Implementation
4.4.1 Data Import
4.4.1.1 GPS traces
4.4.1.2 OSM Street Network and Land Use Information
4.4.2 Preprocessing
4.4.2.1 GPS data
4.4.2.2 Street Network
4.4.3 Map Matching
4.4.4 Calculation of Incline
4.5 Validation
5 Discussion of Results
5.1 Analysis of Crowdsourced GPS traces
5.1.1 Vertical Absolute and Relative Accuracy
5.1.1.1 Absolute Accuracy
5.1.1.2 Relative Accuracy
5.1.2 Coverage and density
5.2 Analysis of Calculated Incline
5.2.1 Exclusion of data from the evaluation
5.2.2 Accuracy of GPS incline
5.2.2.1 Overall error
5.2.2.2 By Land Use Classes
5.2.2.3 By Terrain Classes (mountainous / flat)
5.2.2.4 Effect of Number of GPS Traces on Overall Accuracy
5.2.3 Comparison GPS incline and SRTM incline
5.2.3.1 By Land Use Classes
5.2.3.2 By Terrain Classes
5.3 Limitations of Approach
6 Conclusion and Outlook
6.1 Conclusion
6.2 Outlook
7 Bibliography
List of Figures
Figure 1: A steep slope of a street or path may be inaccessible for wheelchair users. (© Flickr-user: ‘Transguyjay’)
Figure 2: Depending on the street, 0 to many GPS track points fall into one square of 1’’ × 1’’ equivalent to horizontal resolution of SRTM-1. Due to the projection, the grid is not squared. (Map: OSM)
Figure 3: Determination of a 2D position with three satellites (© Anja Köhn, Michael Wößner)
Figure 4: How the satellite constellation influences precision. In (a) the transmitters are orthogonal, which keeps the error region small. If the transmitters are closer together, the error region gets larger (b). (Langley 1999)
Figure 5: The Multipath and Shadowing effect (Conley et al. 2006, p. 280)
Figure 6: Density map of OpenStreetMap nodes (© OpenStreetMap wiki-user ‘Tyr’)
Figure 7: OSM data model for map feature (left) and file system for GPX files (right) (adopted Ramm & Topf 2010, p. 56)
Figure 8: Map Matching. The GPS trace (blue) is snapped to the street network (red) (Map: OSM)
Figure 9: Example of 'Median of 3'-smoothing with the raw data (row 1) and the results using the single median smoothing and the repeated median smoothing. (Tukey 1977, p. 212)
Figure 10: Pilot region Heidelberg / Germany. (Map: OSM)
Figure 11: Example GPX file.
Figure 12: Screenshot of grid map, showing the number of GPS points per grid cell.
Figure 13: Elevation profile of a GPS trace, recorded on a flat street.
Figure 14: GPS traces with lost GPS-signals in tunnels. (Map: OSM)
Figure 15: Difference of DSM and DTM.
Figure 16: Process of deriving incline information out of user-contributed GPS traces.
Figure 17: Filtering and import of GPS traces.
Figure 18: The schema of the relation 'gpx_data_line' for storing the GPS traces.
Figure 19: Flowchart of preprocessing the GPS traces
Figure 20: Columns of the relation, which stores the preprocessed GPS traces.
Figure 21: Schema of relation 'streets'.
Figure 22: Enhancement of street network with land use information in cases, where land use polygon does not cover the street segment.
Figure 23: Flowchart of map matching process.
Figure 24: The map matching process: Select candidate traces with buffer (light green) of street (dark green) (a), create profile lines (blue) (b), select traces (red) which intersect at least 70 % of the profile lines.
Figure 25: Example for two parallel streets which do not have the same incline.
Figure 26: The tables 'gpx_data_line', 'streets_gpx' and 'streets' and their relation to each other.
Figure 27: Properties file of map matching tool
Figure 28: Workflow for calculating the incline of street segments.
Figure 29: Clipping of assigned GPS traces.
Figure 30: Screenshot of visualized GPS track points, colorized according to their elevation. (green=low, red=high)
Figure 31: Vertical accuracy of crowdsourced GPS traces, distinguished by land use class.
Figure 32: Histogram with the differences of GPS and DTM elevation
Figure 33: Relative accuracy of crowdsourced GPS track points, overall and distinguished by land uses.
Figure 34: Map, showing the coverage of the streets with GPS traces. (Map: OSM)
Figure 35: The coverage with GPS traces for different street types.
Figure 36: Average distance of two adjacent GPS track points differentiated by street type.
Figure 37: Visualization of the GPS incline. Streets with no coverage are not displayed. (Map: OSM)
Figure 38: Erroneously calculated DTM incline, due to irregularities of the LiDAR DTM.
Figure 39: Visualization of the error of GPS incline in the pilot region. (Map: OSM)
Figure 40: Histogram of the overall incline error in percent and the bell-curve (red).
Figure 41: The percentage of streets, with an incline error smaller than 2 % and their share with respect to the entire street network.
Figure 42: Situations where the calculated incline differs from the steepest incline.
List of Tables
Table 1: Categories of VGI project according to Jokar Arsanjani (2014)
Table 2: Usage of the key 'incline' and its values
Table 3: Overview of visibility options for the upload of GPS traces
Table 4: Values of highway tag and their share of length in percent.
Table 5: OSM landuse-tags and their characteristics.
Table 6: The effect of the relative accuracy on the calculated incline.
Table 7: The length of street segments for different incline error classes.
Table 8: The achieved accuracy of GPS incline differentiated by land use classes.
Table 9: The achieved accuracy of GPS incline differentiated by terrain classes.
Table 10: Comparison of SRTM and GPS incline in terms of amount of street network with an incline error smaller than 2 %.
Table 11: Comparison of the standard deviations of the incline error, overall and differentiated by land use classes.
Table 12: Comparison of the standard deviations of the incline, overall and differentiated by terrain classes.
1 Introduction
1.1 Motivation
Common routing and navigation systems such as Google Maps¹ or Here² do not consider elevation
or incline information in the calculation of directions. This is because they were initially
designed to calculate directions for cars or other fuel-powered vehicles, which generally do
not benefit from incline information. Many other use-cases exist, however, in which knowledge
about the incline of streets is valuable. For cyclists, pedestrians and especially for
mobility-restricted people the incline of a planned route is of high relevance (cf. Figure 1). Some
cyclists might prefer to take a slightly longer but less steep route, while other cyclists may prefer
inclined streets for training purposes. Incline information is even more relevant for mobility-
restricted people such as wheelchair users, people with walking aids or parents with push-chairs.
For this group of people steep streets or paths may be inaccessible, since they can only pass
inclines up to a certain percentage uphill or downhill. The magnitude of incline which can be
passed by people with walking aids depends strongly on the disability and the type of
wheelchair (manual / electric). Moreover, electric wheelchairs, and electricity-powered vehicles
in general, have a higher energy demand when going uphill and a limited battery capacity. In
addition, charging stations are still rare. Therefore, incline information can be utilized by routing
services to compute the most efficient route in terms of power consumption (cf. Franke et al. 2012).
Figure 1: A steep slope of a street or path may be inaccessible for wheelchair users. (© Flickr-user:
‘Transguyjay’³)
¹ http://google.de/maps, checked on 15/07/2015
² http://here.com, checked on 15/07/2015
³ Source of image: https://www.flickr.com/photos/jayw/2604877785, checked on 15/07/2015
The GIScience Research Group of the University of Heidelberg, together with other partners, is
currently working on a project to extend and improve the OSM routing service OpenRouteService.org
to include accessibility-related data; the motivation for this thesis arose out of this project. The
EU project is called CAP4Access, an acronym for ‘Collective Awareness Platforms for Improving
Accessibility in European Cities and Regions’. The cooperation between the
Institute of Technology Berlin (Technische Universität Berlin) and the University of Heidelberg
for this thesis was established through this project. The aim of the project is to develop methods
and tools for collectively gathering and sharing information about the accessibility of public
spaces. The project focuses on different topics, including for example “Collective tagging”,
“Participatory sensing” and “Routing and navigation”. There are four pilot regions for this project:
Vienna (Austria), London (UK), Elche (Spain) and Heidelberg (Germany) (cf. empirica Gesellschaft
für Kommunikations- und Technologieforschung mbH 2015).
For the calculation of the incline of a street, different types of digital elevation models (DEMs)
may be used. The most accurate option is a high-resolution DEM acquired by airborne
light detection and ranging (LiDAR). Because this method is very expensive, open-licensed
DEMs may be an alternative. Depending on the Open-Data strategy of the authorities, high-
resolution DEMs are available for some regions (cf. OpenStreetMap Wiki 2015d). There are also
two open-licensed DEMs which are almost globally available: SRTM and ASTER GDEM. The SRTM
DEM was acquired by the Shuttle Radar Topography Mission (SRTM) and is available online with
a horizontal resolution of 1 arc second (30 m) and an absolute elevation error of 6.2 m (cf. Farr et
al. 2007). The ASTER Global DEM was compiled from data collected by the ‘Advanced Spaceborne
Thermal Emission and Reflection Radiometer’ (ASTER), mounted on the Terra spacecraft.
The global DEM has a horizontal resolution of 30 m (1 arc second) and a vertical accuracy of
approximately 9 m (cf. Meyer 2011).
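To give a sense of what a horizontal resolution of 1 arc second means on the ground, the following minimal sketch converts one arc second into metres. It assumes a spherical earth with a mean radius of 6,371 km and a latitude of about 49.4° N for the Heidelberg area; both values and the function name `arc_second_in_metres` are illustrative assumptions and are not taken from the thesis.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean earth radius (spherical approximation)


def arc_second_in_metres(latitude_deg):
    """Return the (north-south, east-west) ground length of 1 arc second in metres."""
    one_arc_second_rad = math.radians(1.0 / 3600.0)
    north_south = EARTH_RADIUS_M * one_arc_second_rad
    # Meridians converge towards the poles, so the east-west extent shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Assumed latitude of the Heidelberg area (approximate, for illustration only).
ns_m, ew_m = arc_second_in_metres(49.4)
print(f"1 arc second: {ns_m:.1f} m north-south, {ew_m:.1f} m east-west")
```

Under these assumptions the commonly quoted 30 m applies to the north-south direction (and to the east-west direction near the equator), whereas at mid latitudes a grid cell is noticeably narrower from east to west.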
Open-licensed DEMs with high accuracy are not globally available, whereas those DEMs which
are nearly globally available suffer from a poor horizontal resolution and vertical accuracy.
Especially for hilly or mountainous regions and high-resolution scenarios this data might not be
sufficient to derive the incline of streets with an acceptable accuracy. Therefore, I propose a
method to derive incline information from GPS traces, contributed by users of the OpenStreetMap
project.
Due to the fast development of mobile phones with integrated GPS receivers, GPS traces can easily
be recorded by everybody. According to Liu et al. (2014), a positional accuracy of 5 to 10 meters
and a vertical accuracy of up to 25 meters can be expected from GPS traces collected by handheld
GPS devices or smartphones. Admittedly, the vertical accuracy is very poor; however, for
calculating the incline only the elevation differences between adjacent points are relevant. If a
GPS trace was recorded within a short time span in an open area, it may be assumed that all points
of the trace were recorded under similar atmospheric influences and with a similar satellite
constellation. Therefore, it may be expected that the track points of one GPS trace have a similar
absolute error and consequently a fairly good relative accuracy. In addition, the coverage of a
street by multiple GPS traces as well as a relatively high density of GPS track points may compensate
for a poor accuracy. Figure 2 shows the grid of 1 by 1 arc second, which is equivalent to the horizontal
resolution of the SRTM DEM, as well as the GPS track points extracted from the OpenStreetMap GPS
data. It can be seen that for most streets many GPS track points fall into one square, but there
are also some streets with no or just a few GPS track points.
Figure 2: Depending on the street, 0 to many GPS track points fall into one square of 1’’ × 1’’ equivalent to
horizontal resolution of SRTM-1. Due to the projection, the grid is not squared. (Map: OSM)
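The argument that a shared absolute error does not harm the incline can be illustrated with a small sketch. It assumes hypothetical elevations along a street, a point spacing of 20 m, and a constant absolute error of 15 m that affects every track point of one trace; all numbers and function names are invented for illustration and are not part of the tools developed in this thesis.

```python
import math


def incline_percent(horizontal_distance_m, elev_start_m, elev_end_m):
    """Incline in percent: elevation difference divided by horizontal distance."""
    return 100.0 * (elev_end_m - elev_start_m) / horizontal_distance_m


def segment_inclines(elevations_m, spacing_m):
    """Incline between each pair of adjacent points with constant horizontal spacing."""
    return [
        incline_percent(spacing_m, start, end)
        for start, end in zip(elevations_m, elevations_m[1:])
    ]


# Hypothetical true elevations of four track points, 20 m apart.
true_elevations = [110.0, 110.8, 111.6, 112.4]
spacing = 20.0

# Assumption: every point of the trace carries the same absolute error of +15 m.
common_offset = 15.0
measured_elevations = [e + common_offset for e in true_elevations]

true_inclines = segment_inclines(true_elevations, spacing)
measured_inclines = segment_inclines(measured_elevations, spacing)

for t, m in zip(true_inclines, measured_inclines):
    print(f"true incline {t:.2f} %, incline from offset elevations {m:.2f} %")
```

Because the offset cancels in the elevation difference, both lists agree. In real traces the error is only approximately constant, so the remaining point-to-point noise is what limits the accuracy of the incline.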
1.2 Objectives
The objective of this thesis is to develop methods and tools to calculate the incline of a street
network, including paths for pedestrians and cyclists. The incline shall be calculated from
GPS traces which are collected by contributors to the OpenStreetMap project, since this may
represent a low-cost alternative to expensive high-accuracy DEMs. The GPS data is a collection of
GPS traces, collected by thousands of users with different devices and transportation modes. The
device and transportation mode are not given in the data, which makes it difficult to judge the
accuracy and density of GPS track points. Therefore, the raw GPS data shall be assessed with regard
to accuracy, coverage and density. Furthermore, the incline calculated from GPS traces shall be
validated against a high-accuracy DEM obtained from LiDAR measurements, to see how accurately the
incline was calculated. As described in the motivation, globally available DEMs also represent an
alternative to high-accuracy DEMs for deriving incline. Thus, the incline calculated from
GPS traces shall also be compared to the incline derived from the SRTM-1 DEM. It is expected that,
due to a higher density of elevation information, a higher accuracy will be achieved with user-
generated GPS traces. The tools developed for the purpose of this thesis shall be published and
provided to the OpenStreetMap community, since tools for processing GPS data are still rare.
To summarize, the aims of this thesis can be formulated briefly as follows:
- Creation and implementation of a workflow to calculate the incline of streets, using user-
contributed GPS traces.
- Assessment of the quality of voluntarily collected GPS traces in terms of
  o vertical accuracy (absolute and relative)
  o coverage of GPS traces
- Assessment of the achieved quality of the incline information, compared to LiDAR and
SRTM-1 DEM.
- Publication of the developed software as Open Source and provision to the OpenStreetMap
community
1.3 Outline
The thesis is structured as follows. In chapter 2, background information which is important for
this topic is discussed. This involves the topics Global Navigation Satellite Systems,
Volunteered Geographic Information, the OpenStreetMap project and data mining. In chapter 3,
research on related topics is presented. The methodology of this research is
described in chapter 4, which includes the data and tools used as well as all the steps of deriving
incline from user-generated GPS traces. In chapter 5, the outcome of the methods described in
chapter 4 is judged and discussed using statistical methods, with reference to a high-
accuracy DTM on the one hand and the low-cost alternative SRTM-1 DEM on the other hand.
Furthermore, chapter 5 includes the quality assessment of the GPS data. In chapter 6 this thesis is
summarized and concluded, and ideas on how to progress with this topic in the future are
given.
2 Background
In the following chapter, background information related to this research is given. Firstly, the
Global Positioning System and other Global Navigation Satellite Systems will be introduced and
their functionality explained, since GPS traces are one of the major data sources of this research.
The traces are collected voluntarily; therefore an overview of volunteered geographic information
(VGI) is given. After that, one of the most popular VGI projects, OpenStreetMap, will be introduced.
At the end of this chapter, the terms data mining and spatial data mining will be discussed.
2.1 Global Navigation Satellite Systems
Nowadays, Global Navigation Satellite Systems (GNSS) are an essential part of the field of
navigation and positioning. With GNSS it is possible to determine any location on the earth’s
surface with fairly good accuracy. Since this research is about mining information from movement
trajectories recorded with the help of such systems, an overview of which systems exist and how
they work shall be given. Several countries either operate a GNSS or are currently building one;
however, this section covers the setup and functionality of the Global Positioning System
(GPS) only, since this is the first and most stable GNSS. GPS is the GNSS of the United States of
America. Furthermore, the method for determining a location will be explained, followed by an
overview of error sources and their impact on the accuracy. An overview of other GNSS is given at
the end in section 2.1.3.
2.1.1 GPS Setup and Determination of Location
The Global Positioning System (GPS) was developed and is operated by the U.S. Department of
Defense. It was initially developed for military purposes, and in the beginning the accuracy was
degraded for civilian use. This is known as Selective Availability (SA). In 2000, the degradation of
accuracy was switched off, which has since offered higher accuracy to civilian users. This enabled
the realization of many standard applications, such as the private use of so-called Location-Based-
Services (cf. Hofmann-Wellenhof et al. 2008, pp. 309–311).
For the GPS setup, three segments play an important role: the space segment, the control
segment and the user segment. The space segment consists of 24 active and several spare satellites.
The active satellites are distributed over six orbits at an altitude of 20,200 km. The control segment
consists of several control stations distributed around the earth. The tasks of the control segment
include tracking the satellites in order to determine their orbits and synchronizing
the atomic clocks mounted on the satellites. The user segment is the receiver,
which receives the signals emitted by the satellites and calculates the current location. Further
information on the GPS segments can be found in Hofmann-Wellenhof et al. (2008, pp. 322–327).
As already mentioned, the position of the satellites at a certain time is known through the orbital
parameters, which are observed by the control segment. The satellites are constantly emitting
signals which can then be received by the receiver. The signal consists of two carrier waves, L1 with
a frequency of 1575.42 MHz and L2 with 1227.60 MHz. Codes are modulated onto both waves;
they carry a message containing information about the satellite, such as orbit parameters
and time of signal emission. While both the C/A-code (coarse/acquisition) and the P-code
(precision) are modulated onto L1, L2 only carries the P-code. Combining the P-code from the two
carrier waves allows a higher positional accuracy through the elimination of ionospheric influences.
In addition, the P-code is encrypted to ensure that it is only available to authorized users,
such as the military (cf. Hofmann-Wellenhof et al. 2008, pp. 315-322).
Hofmann-Wellenhof et al. (2008, pp. 161-191) mention three mathematical models for positioning:
single point positioning, differential positioning and relative positioning. Single
point positioning and differential positioning will be described below. Relative positioning is not
relevant for this research. For further information on this, see Hofmann-Wellenhof et al. (2008,
pp. 173-191). Single point positioning is applied when determining the position using smartphones
or other handheld devices. When using this method, the pseudoranges between the satellites and the
receiver are determined. This can be done by either using the code modulated on the carrier waves,
using the phase of the carrier wave, or based on Doppler data. Here, only the first approach will be
explained, since common smartphones or other handheld devices make use of the code. To
determine the 2D position (X,Y) of a location, the pseudoranges of at least three satellites are
necessary. Each pseudorange is obtained by multiplying the signal's time of travel from the satellite
to the receiver by the speed of light. To determine the time of travel, it is necessary
for the receiver’s clock to be synchronized with the satellite clock. Since this is not the case prior to
measuring, the pseudorange to a third satellite is needed in order to determine and correct the clock bias.
This is depicted in Figure 3. The solid lines show the pseudoranges to the satellites before the clock
correction. It can be seen that there are three intersection points which are possible locations for the
receiver, depicted as ‘B’. After the correction of the clock and the subsequent correction of the
pseudoranges (dashed lines), only one intersection point remains (A).
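The principle described above can be illustrated with a small numerical sketch. It is not taken from the cited literature: it assumes a flat 2D geometry, error-free measurements and hypothetical satellite coordinates, and it estimates the two coordinates together with the clock bias (expressed in metres) by a simple Gauss-Newton iteration. The function names solve_linear and solve_2d_position are chosen for illustration only.

```python
import math


def solve_linear(a, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def solve_2d_position(sats, pseudoranges, iterations=10):
    """Estimate receiver x, y and clock bias (in metres) from three or more pseudoranges.

    Assumed model: pseudorange_i = distance(receiver, satellite_i) + bias,
    where bias = speed of light * receiver clock error.
    """
    x, y, bias = 0.0, 0.0, 0.0  # initial guess
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]  # normal matrix J^T J
        jtr = [0.0] * 3                      # right-hand side J^T r
        for (sx, sy), rho in zip(sats, pseudoranges):
            dist = math.hypot(x - sx, y - sy)
            row = [(x - sx) / dist, (y - sy) / dist, 1.0]  # partial derivatives
            residual = rho - (dist + bias)
            for i in range(3):
                jtr[i] += row[i] * residual
                for j in range(3):
                    jtj[i][j] += row[i] * row[j]
        dx, dy, db = solve_linear(jtj, jtr)
        x, y, bias = x + dx, y + dy, bias + db
    return x, y, bias


# Hypothetical example: three satellites, receiver at (1000 m, 2000 m),
# receiver clock error equivalent to 30 km of range.
sats = [(0.0, 20_000_000.0), (15_000_000.0, 15_000_000.0), (-12_000_000.0, 18_000_000.0)]
truth = (1000.0, 2000.0, 30_000.0)
measured = [math.hypot(truth[0] - sx, truth[1] - sy) + truth[2] for sx, sy in sats]
estimate = solve_2d_position(sats, measured)
```

With only two pseudoranges the three unknowns could not be separated, which is why the third satellite is required in the 2D case.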
Figure 3: Determination of a 2D position with three satellites (©Anja Köhn, Michael Wößner4)
When determining a 3D position (X,Y,Z), four instead of three satellites are necessary, since there
is one more unknown. Considering all measurements, a non-linear equation system as shown in
Hofmann-Wellenhof et al. (2008, p. 162) can be solved to determine the unknown coordinates of
the receiver’s location. According to Cosentino et al. (2006, p. 379), a horizontal accuracy of
around 10 m can be achieved in 95 % of cases when applying single point positioning with one
frequency. This is because the measurements are influenced by several factors which will be
discussed in section 2.1.2. To eliminate some of these influences and thereby improve the accuracy of
the measurement, differential positioning may be performed. To do so, two receivers are needed,
a reference receiver and a remote receiver. The coordinates of the reference station are known and
considered as the true values. Consequently, the error of the observed pseudoranges can be determined.
The observed error can then be transmitted to the remote receivers, where it is used to correct the
pseudoranges (cf. Hofmann-Wellenhof et al. 2008, p. 169).
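The correction step of differential positioning can be sketched as follows. This is a simplified illustration and not the formulation of the cited authors: it assumes that both receivers observe the same satellites, that the per-satellite range error is identical at both receivers, and that receiver clock biases are ignored. The function names and the numbers are hypothetical.

```python
import math


def range_corrections(ref_position, sat_positions, ref_pseudoranges):
    """Correction per satellite: true geometric range minus observed pseudorange."""
    return [math.dist(ref_position, sat) - observed
            for sat, observed in zip(sat_positions, ref_pseudoranges)]


def apply_corrections(remote_pseudoranges, corrections):
    """Add the corrections determined at the reference station to the remote observations."""
    return [observed + corr for observed, corr in zip(remote_pseudoranges, corrections)]


# Hypothetical example with two satellites and a common error per satellite.
sat_positions = [(0.0, 0.0, 20_200_000.0), (10_000_000.0, 0.0, 18_000_000.0)]
ref_position = (0.0, 0.0, 0.0)           # known coordinates of the reference station
remote_position = (500.0, 300.0, 0.0)    # unknown in practice, used here to simulate data
common_errors = [7.5, -4.2]              # metres, assumed equal at both receivers

ref_obs = [math.dist(ref_position, s) + e for s, e in zip(sat_positions, common_errors)]
remote_obs = [math.dist(remote_position, s) + e for s, e in zip(sat_positions, common_errors)]

corrected = apply_corrections(remote_obs, range_corrections(ref_position, sat_positions, ref_obs))
```

Under these assumptions the corrected pseudoranges equal the true geometric ranges of the remote receiver. In practice the errors are only similar, and the agreement weakens with growing distance between the two receivers.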
2.1.2 Error Sources
As already mentioned, the measurements are influenced by errors arising from different sources.
Hofmann-Wellenhof et al. (2008) categorize the errors according to their sources, namely satellite,
signal propagation and receiver.
Satellite:
Errors originating from the satellite are the satellite clock bias and orbital errors. The highly-
accurate atomic clocks of the satellites are controlled and frequently updated by the control
segment on earth in order to synchronize the satellites among themselves. The clocks get an update
once a day, and therefore the clock error is small immediately following the update and increases
until the next update. In addition to the clock biases, errors are also contained in the ephemeris
4 Source of image: http://www.kowoma.de/gps/Positionsbestimmung.htm, checked on 15/07/2015
data transmitted to the receiver. The ephemeris data contains information about the satellite’s orbit
and is used to calculate the satellite’s position. The orbital parameters are estimated and may differ
from the actual orbit of the satellite (cf. Conley et al. 2006, pp. 304 f.).
In addition to the aforementioned error sources originating from the satellite, precision also
depends on the satellite constellation in the sky. The effect is shown in Figure 4 in the case of a
simple ranging system with two transmitters. When the rays of the two transmitters (satellites)
intersect at an angle of 90°, the region in which the receiver may lie is relatively small (Figure 4a). If
the transmitters are closer together, as shown in Figure 4b, the region becomes larger and with it the
uncertainty of the location determination. This is the reason why the vertical accuracy is generally
worse than the horizontal one. For the horizontal coordinates, the satellites may be in a good
constellation, meaning that there are satellites in every direction, keeping the error region small. For
the vertical coordinate, all satellites are above the receiver and therefore only in one
direction (cf. Langley 1999).
Figure 4: How the satellite constellation influences precision. In (a) the transmitters are orthogonal, which keeps
the error region small. If the transmitters are closer together, the error region gets larger (b). (Langley 1999)
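The influence of the constellation is commonly quantified by dilution of precision (DOP) values. The following sketch is an illustration added here, not part of the cited sources. It assumes a hypothetical sky with one satellite at the zenith and three satellites at 20° elevation, builds the usual geometry matrix from azimuth and elevation, and derives the horizontal (HDOP) and vertical (VDOP) values from the inverse of the normal matrix.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def dilution_of_precision(sat_directions):
    """Return (HDOP, VDOP) for satellites given as (azimuth_deg, elevation_deg)."""
    rows = []
    for az_deg, el_deg in sat_directions:
        az, el = math.radians(az_deg), math.radians(el_deg)
        rows.append([math.cos(el) * math.sin(az),  # east component
                     math.cos(el) * math.cos(az),  # north component
                     math.sin(el),                 # up component
                     1.0])                         # receiver clock term
    gtg = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    q = invert(gtg)
    hdop = math.sqrt(q[0][0] + q[1][1])
    vdop = math.sqrt(q[2][2])
    return hdop, vdop


# Hypothetical constellation: all satellites are above the receiver.
sky = [(0.0, 90.0), (0.0, 20.0), (120.0, 20.0), (240.0, 20.0)]
hdop, vdop = dilution_of_precision(sky)
```

For this constellation the VDOP is larger than the HDOP, which mirrors the statement that the vertical coordinate is determined less precisely than the horizontal ones.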
Signal Propagation:
During the propagation of signals through the atmosphere, a delay occurs. The ionosphere, which is
the layer from approximately 50 km to 1000 km above the earth, is a dispersive medium. The
dispersion is dependent on the frequency. Thus, it is possible to correct the ionospheric influences
when applying dual-frequency single point positioning. The frequencies L1 and L2 (cf.
2.1.1) experience different delays or, in other words, have different propagation speeds. A correction
can therefore be determined (cf. Conley et al. 2006, p. 161).
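How two frequencies allow such a correction can be shown with the widely used ionosphere-free combination of two pseudoranges. The sketch below is an added illustration under the assumption of a first-order ionosphere model, in which the delay is inversely proportional to the squared frequency. The simulated range and delay are hypothetical values.

```python
F_L1 = 1575.42e6  # Hz, carrier frequency L1 (cf. 2.1.1)
F_L2 = 1227.60e6  # Hz, carrier frequency L2 (cf. 2.1.1)


def ionosphere_free(p1, p2):
    """Combine pseudoranges measured on L1 and L2 so that the first-order
    ionospheric delay (assumed proportional to 1/f^2) cancels out."""
    gamma = (F_L1 / F_L2) ** 2
    return (gamma * p1 - p2) / (gamma - 1.0)


# Hypothetical simulation: true range 21,000 km, ionospheric delay of 5 m on L1.
true_range = 21_000_000.0
k = 5.0 * F_L1 ** 2                 # delay on frequency f is k / f^2
p1 = true_range + k / F_L1 ** 2     # delayed by 5 m
p2 = true_range + k / F_L2 ** 2     # delayed more, since L2 has the lower frequency
combined = ionosphere_free(p1, p2)
```

A single-frequency receiver, as found in common smartphones, cannot form this combination and has to rely on a model of the ionosphere instead.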
Receiver:
Errors caused on the receiver side are, among others, the multipath effect and shadowing. Both are
depicted in Figure 5. The multipath effect occurs when, in addition to the direct signals, signals
reflected from the surfaces of nearby structures are also captured by the receiver. This leads to errors
in the calculation of the pseudorange, since the reflected signal traveled a longer path and
consequently took more time. Shadowing occurs when the line of sight from the receiver to the
satellite is obstructed by trees or roofs. As a result, the signal reaches the receiver either with low
energy or not at all and cannot be used for positioning. Multipath and shadowing can also occur in
combination, as shown in Figure 5: the signal reflected by the building is received with higher energy
than the signal shadowed by the canopy. Multipath and shadowing effects are random and highly
dependent on the time and the receiver’s location. The error caused by these effects can be high in
magnitude and can sometimes be the main contributor to the total error
(cf. Conley et al. 2006, pp. 279-280). User-generated GPS traces are recorded without
consideration of such effects, including at locations where multipath and shadowing account for a
large share of the error. This is mainly the case in urban areas with high buildings or in forested areas.
Figure 5: The Multipath and Shadowing effect (Conley et al. 2006, p. 280)
2.1.3 GLONASS, Galileo and Beidou
In addition to GPS, other GNSS worth considering include GLONASS, Galileo and Beidou.
GLONASS is operated by Russia and is, like GPS, fully operational with 21 active and 3 spare
satellites in three orbital planes (cf. Hofmann-Wellenhof et al. 2008, pp. 348-349). After
GLONASS first became fully operational in 1996, several satellites failed over the years and
the system could no longer be operated with full coverage. Several new satellites were launched, but
this was not enough to maintain the full constellation (Feairheller & Clark 2006). Nowadays,
GLONASS is again fully operational. Galileo and Beidou are still under construction and do not
have world-wide coverage yet. While Galileo, the European answer to GPS and GLONASS, has
only four satellites launched, the Chinese Beidou consists of 14 satellites and already operates in
the Asia-Pacific region. It is planned that Beidou will reach its full constellation of 35 satellites
in different orbits by 2020. Once this happens, it will have world-wide coverage, similar to
GPS and GLONASS (Santerre et al. 2014). Galileo is a project of the European Union and the
European Space Agency to build a GNSS similar to GPS and GLONASS, but under civilian
control. The development of the system was initiated in 1994. The first idea was to cooperate with
the United States to develop a “next-generation” GPS; however, the United States did not wish to
cooperate with foreign countries. Therefore, it was decided to build a new and independent
GNSS which is interoperable with the existing GPS and GLONASS systems. Consequently,
receivers will be able to use the three systems in combination, which will make it possible to
achieve higher accuracies, since more satellites are involved in the positioning process (cf.
Hofmann-Wellenhof et al. 2008, pp. 365-367). Galileo is currently still under construction and is
planned to be fully operational by 2020. Two satellites were launched in each of 2011 and 2012. With
a total of four satellites, first tests of the system were then possible. When the system is
fully operational, a total of 30 satellites will be orbiting the earth at an altitude of
23,222 km. Out of the 30 satellites, 27 will be actively used and three will be available as
replacements (European Space Agency 2015).
2.2 Volunteered Geographic Information
For this research, voluntarily collected GPS and street-level data is used. For this special type of
data the term ‘Volunteered Geographic Information’ has emerged (Goodchild 2007). It describes a
special case of user-generated content (UGC). According to Bauer (2010), the term UGC has been
used since the mid-nineties for content on the internet which is produced by users. Due to the
rapid development of internet technologies, fast internet access has become available and affordable
for many users. This development has made it possible for users not only
to search the internet, but also to create new content. The term UGC is kept general intentionally,
since it may be any kind of media such as videos, pictures or text. When this data refers to a spatial
location, it is known as ‘Volunteered Geographic Information’. The terminology and characteristics
of VGI are discussed below, and examples of how VGI can be classified into groups are provided.
2.2.1 Terminology and Nature of VGI
The term ‘volunteered geographic information’ (VGI) was introduced by Goodchild in 2007. It
describes a phenomenon which was new in the field of geography at the time: geographic
information is collected voluntarily by mostly untrained people without any financial compensation.
He also calls this phenomenon “citizens as sensors”.
It has also been referred to as ‘crowdsourcing geospatial information’ by Heipke (2010) and Ramm
et al. (2011). Sui (2008) describes the recent development as the ‘wikification of GIS’ and points
out that the actors and methods of collecting geographic information have changed. Previously, only
experts like surveyors or cartographers acquired and processed geodata, which was expensive.
Nowadays, a large amount of data is freely available and the people who acquire and
process the data are not necessarily experts anymore.
Resch (2013) distinguishes between different concepts of acquiring the data. In his paper, he
discusses the difference between the terms ‘people as sensors’, ‘collective sensing’ and ‘citizen
science’. While these concepts are closely related, there are differences worth noting. ‘People as
sensors’ describes the concept of people who collect information through subjective observations.
This might, for example, be the smoothness of a street surface or the water quality of lakes. The
term ‘collective sensing’ “[…] analyses anonymized data coming from collective networks, such as
Flickr, Twitter, Foursquare or the mobile phone network” (Resch 2013). The third term, ‘citizen
science’, means that people contribute data collected by sensors integrated in their smartphones or
other devices. In comparison to ‘people as sensors’, this data is not subjective and only comes from
sensor measurements.
VGI is often collected with the help of low-cost GPS receivers, integrated in most smartphones or
other handheld GPS devices. With those devices the coordinates of a location can easily be
determined. The acquired coordinates or GPS traces may then be used, for example, to digitize the
outline of a street traveled or to mark points of interest at the measured location. Images may also
be georeferenced by adding coordinates. Another way of gathering information is digitizing
features from satellite imagery. (cf. Goodchild 2007)
Sester et al. (2014) provide an overview of characteristics of VGI. Volunteered Geographic
Information can be highly heterogeneous in terms of the quality and coverage. Depending on the
number of volunteers, some regions may be more complete than others. Figure 6 shows the density
map depicting the nodes available in OpenStreetMap, the most famous VGI project. The brighter
the color, the more nodes lie within that pixel. It can be seen that developed regions with a high
population density, like Europe and North America, have more nodes than others. This may be
attributed to the fact that more people who are potential contributors live in these places,
and that there may also be more features to digitize. If there are more volunteers in a
region, the data is also more likely to be up-to-date. Especially in comparison to authoritative data
which is usually updated in certain cycles, VGI is updated whenever a volunteer detects a change
in the real world. Another characteristic of VGI is the heterogeneity with respect to semantic
information. In particular, when collecting topographic data in OpenStreetMap, there is no
standardized catalogue of features and their semantic information. The semantic information is
added as key-value pairs, which are commonly discussed in the community; in practice, however, a
user does not need to follow these agreements.
Figure 6: Density map of OpenStreetMap nodes (© OpenStreetMap wiki-user ‘Tyr’5)
2.2.2 Classification and Examples
There are plenty of projects which somehow deal with geographic information. Jokar Arsanjani
(2014) categorized the projects according to the purpose or type of data (topographic, images,
video, text) which is being shared. The categories are listed in Table 1.
World mapping projects | Weather mapping | Business mapping
Social media mapping | Crisis and disaster mapping | Transportation mapping
Environmental and ecological monitoring | Outdoor activity mapping | Crime mapping and tracking
Table 1: Categories of VGI projects according to Jokar Arsanjani (2014)
Hahmann (2014) extends this list with ‘encyclopedic projects’ such as Wikipedia6. For this thesis,
the most popular world mapping project, OpenStreetMap7, is of relevance as a data source for
street network and GPS traces. In OpenStreetMap, volunteers create a world map
under an open license. The contributors generate map features by digitizing recorded
GPS traces or satellite imagery. In addition to digitized topographic data, raw GPS traces are also
collected within this project. The project is explained in more detail in section 2.3. Other
5 Source of image: http://wiki.openstreetmap.org/wiki/File:OSM-node-density-map-2013.png, checked on 15/07/2015
6 http://wikipedia.org
7 http://openstreetmap.org
‘world mapping projects’ are, among others, Wikimapia, which, similar to OpenStreetMap, aims to
mark geographical objects, and Google Map Maker8, which is operated by Google and was
initiated to improve the quality of Google Maps9.
Furthermore, ‘outdoor activity mapping’ projects, such as WikiLoc10 or GPSIES11, shall be
mentioned. These projects aim to collect GPS traces of outdoor activities undertaken by the
contributors. The purpose is to provide outdoor routes including additional information, such as
points of interest, distance or elevation profile. In addition, traces may be rated and can therefore be
used to search for good outdoor routes, depending on the intention of the user. These projects are a
potential data source of GPS traces for the derivation of incline values.
2.3 OpenStreetMap
This section briefly introduces the VGI project OpenStreetMap, since the data of this project is
used for this research. Firstly, the project will be introduced in general and a short history will be
given. Secondly, how the geographic and semantic information is handled within this project will
be explained, followed by an assessment of how incline can be represented and how often it is
actually mapped.
2.3.1 Introduction to Project
The OpenStreetMap (OSM) project was founded by Steve Coast at University College London in
2004 and aims to create a freely and globally available map. Map features,
including their semantic information, are added and modified by the community. The way of
contributing data is typical for a VGI project. In the first years of the project, data was exclusively
contributed by capturing the travelled path with a GPS device, followed by the digitization of the
recorded route on a computer. For editing the map, several editors are available, such as JOSM or
iD. These editors make it very convenient to load the raw GPS data and create geometries.
Furthermore, the editors handle the subsequent upload of the created features to the OSM database.
In addition to the vector geometry of the map feature, the raw GPS data can also be
uploaded (Ramm & Topf 2010, pp. 3 f.). In 2007, the company Yahoo! allowed OSM contributors
to use their aerial imagery for the digitization of map features. With the aerial imagery, it became
very easy to create features such as buildings, which are hard to measure using a handheld GPS
device. This also enables contributors to create features remotely, without being on-site or having
any local knowledge about the region (Haklay & Weber 2008). Three years later, in 2010,
Microsoft also provided their aerial imagery for the purpose of contributing to OpenStreetMap
8 https://google.com/mapmaker
9 http://google.com/maps
10 http://www.wikiloc.com/
11 http://www.gpsies.com/
(OpenStreetMap Wiki 2015a). Other sources of data include donations from public agencies and
the integration of other open data. An example is the integration of the entire street network of the
Netherlands, following a donation by the company AND (Automotive Navigation Data) in 2007.
The data stored in the OSM database is licensed under the Open Database License (ODbL). It
allows everybody to share the data, produce their own work from it and redistribute it, as long as
the new database is also published under ODbL and OSM and its contributors are attributed (Open
Knowledge Foundation 2015). The OSM database has not always been under ODbL. From the
beginning of the project until September 2012, the data was licensed under the terms of the
Creative Commons Attribution-ShareAlike (CC-BY-SA) license. CC-BY-SA was made for
creative works, such as music and pictures. Therefore, it was poorly suited to collections of
data or databases such as OpenStreetMap, since its terms are hard to interpret for
databases. Furthermore, it was not possible to mix data under CC-BY-SA with data under other
licenses. This was made possible by the change to ODbL (OpenStreetMap Foundation Wiki
2015b). The process of changing the license was initiated by the OpenStreetMap Foundation, a
non-profit organization founded in Great Britain to support the OSM project with organizational
tasks. Among other things, the foundation hosts the OpenStreetMap servers, helps with collecting
donations for servers and supports the community with organizing events, such as so-called mapping
parties or conferences. The OpenStreetMap Foundation also organizes working groups which provide
support in specific fields or topics. An example is the Operations Working Group, which is
responsible for issues related to servers and the OSM API (cf. OpenStreetMap Foundation Wiki
2015a, 2015c).
Over the years from the beginning of the project in 2004 to now, the OSM project has become
more and more popular. Haklay & Weber (2008) say that “[…] OpenStreetMap (OSM) is probably
the most extensive and effective project currently under development”. There are now over 1.9
million registered users, who have contributed over 2.75 billion nodes and over 250 million line
objects. Worldwide, users have uploaded GPS traces providing approximately 4.5 billion track points
(OpenStreetMap Wiki 2015c).
As is typical for a VGI project, the quality and completeness of such data may vary. Neis et al.
(2012) evaluated the OSM street network of Germany from 2007 to 2011 using a dataset of a
commercial provider. If only streets which can be used for car navigation are taken into account
(name or route number of the street is known), the street network of OSM is 9 % smaller than the data
from the commercial provider. But if the entire street network is considered, the OSM dataset is
27 % larger, or even 31 % larger if paths for pedestrians are considered. This means there are
streets or paths in OSM which do not exist in the commercial dataset. The reason for this is
that OSM contains small hiking trails, paths or bicycle lanes, which are not
relevant for the commercial map provider. Since this evaluation was made four years before
the time of writing, it can be expected that the completeness of the street network has improved
even further since then. This shows the potential of the OpenStreetMap dataset and indicates its
suitability for many applications. According to Neis & Zielstra (2014b), it has been shown in the past
that OSM data can be used for various applications such as crisis management, mapping for different
purposes (hiking, public transport) or routing. There are several routing services for different
purposes, such as komoot12 for cycling and hiking or Skobbler13 for car navigation.
2.3.2 Data Model
As mentioned in section 2.3.1, the OpenStreetMap project stores both the digitized map features
and the GPS traces as raw data. While the map features follow a data model, the GPS traces
are stored in a file system as GPX files14 (cf. Ramm & Topf 2010, pp. 317–318). The data model of
the map features follows a simple approach. Figure 7 shows the GPX file system next to the object
types available in OpenStreetMap and their relations to each other. A node is a representation of a
point on the earth’s surface, described by longitude and latitude. Ways are the representation of
linear features. Instead of being defined by a sequence of coordinates, a way object references at
least two and up to 2000 ordered nodes. Since the referenced nodes are ordered, the way is directed
from the first to the last node. If a line is closed (starting point is equal to the end point), the way
can, but need not, be considered a polygon. The third object type is a relation. Several
nodes, ways and/or other relations can be referenced by a relation object. This is done when
different objects are related to each other, for example when a number
of ways define the route of a bus within a city. Semantic information is added to all objects by
defining a number of tags. A tag is a key-value pair, separated by ‘=’. A tree, for example,
would have the tag ‘natural=tree’. Tags may have any combination of key and value; however, for
consistency reasons, the community agreed on a list of tags15 which should be used for mapping.
This concept of adding semantic information to a geometry object has the advantage that new tags
can always be introduced if required for special use cases (cf. Ramm & Topf 2010, pp. 55-59).
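The three object types and the tag concept described above can be mirrored in a few lines of code. The following sketch is only an illustration and not the actual OSM database schema: the class layout, the helper parse_tag and the example objects are assumptions made here, while the limits of 2 to 2000 node references and the tag ‘natural=tree’ are taken from the text. The tags on the example way and relation are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A point on the earth's surface, described by latitude and longitude."""
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """A linear feature that references 2 to 2000 ordered node ids."""
    id: int
    node_refs: List[int]
    tags: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        if not 2 <= len(self.node_refs) <= 2000:
            raise ValueError("a way must reference between 2 and 2000 nodes")

    @property
    def is_closed(self) -> bool:
        """True if the first and last referenced node are identical."""
        return self.node_refs[0] == self.node_refs[-1]


@dataclass
class Relation:
    """A group of related objects, each given as (object type, object id)."""
    id: int
    members: List[Tuple[str, int]]
    tags: Dict[str, str] = field(default_factory=dict)


def parse_tag(text: str) -> Tuple[str, str]:
    """Split a tag written as 'key=value' into its key and value."""
    key, value = text.split("=", 1)
    return key, value


# Illustrative objects (ids and coordinates are made up).
tree = Node(1, 49.41, 8.69, dict([parse_tag("natural=tree")]))
street = Way(10, [1, 2, 3], {"highway": "residential", "incline": "6%"})
ring = Way(11, [1, 2, 3, 1])
bus_route = Relation(100, [("way", 10)], {"type": "route", "route": "bus"})
```

Because a way stores references instead of coordinates, moving a shared node changes every way that references it, which keeps connected streets topologically consistent.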
12 http://komoot.de, checked on 15/07/2015
13 http://skobbler.de, checked on 15/07/2015
14 GPX is the data format, based on XML, for exchanging GPS traces.
15 http://wiki.openstreetmap.org/wiki/Map_Features, checked on 15/07/2015
Figure 7: OSM data model for map features (left) and file system for GPX files (right) (adapted from
Ramm & Topf 2010, p. 56)
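Since the GPX traces mentioned in this section are plain XML, their track points can be read with standard tools. The following minimal sketch assumes GPX version 1.1 with its usual namespace and uses a made-up two-point track. It only extracts latitude, longitude and, if present, elevation, and is not the processing chain used in this thesis.

```python
import xml.etree.ElementTree as ET

# Namespace of GPX 1.1 documents (assumed version).
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Made-up sample trace with two track points.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="49.4100" lon="8.6900"><ele>114.0</ele></trkpt>
    <trkpt lat="49.4105" lon="8.6910"><ele>118.5</ele></trkpt>
  </trkseg></trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return a list of (lat, lon, elevation) tuples; elevation is None if missing."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.findall(".//gpx:trkpt", GPX_NS):
        ele = trkpt.find("gpx:ele", GPX_NS)
        points.append((float(trkpt.get("lat")),
                       float(trkpt.get("lon")),
                       float(ele.text) if ele is not None else None))
    return points


points = read_track_points(SAMPLE_GPX)
```

The elevation element is optional in GPX, so traces without it cannot contribute to the derivation of incline values.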
2.3.3 Incline Information in OpenStreetMap
The information about the incline can easily be added to ways as semantic information. According
to the list of proposed OSM map features mentioned in 2.3.2, the key ‘incline’ should be used
when adding this information to a street or a path. The corresponding value of the tag is the actual
incline value, given in percent or degrees. Since there are two possible units, the unit must be
indicated with ° or %16. Positive or negative values indicate whether the way is inclined up- or
downwards, relative to the direction of the way. When the inclined part of a street does not cover the
entire street segment, the street should be split at the start and end of the inclined part. Furthermore,
it is recommended that the steepest incline along the path be added as the value. If the exact incline
value is unknown, but it is visible that the street is inclined, the value of the key ‘incline’ can also
be ‘up’ or ‘down’ (cf. OpenStreetMap Wiki 2015e).
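The different forms of the ‘incline’ value can be normalized with a small helper. The sketch below is an illustration added here, with a hypothetical function name. It converts values given in degrees to percent using the relation percent = 100 · tan(angle), keeps ‘up’ and ‘down’ as directions and passes any other text through unchanged.

```python
import math


def parse_incline(value):
    """Classify the value of an 'incline' tag.

    Returns ('percent', float) for numeric values with a unit,
    ('direction', 'up' or 'down') for unspecific values,
    and ('other', text) for anything else.
    """
    text = value.strip()
    if text in ("up", "down"):
        return ("direction", text)
    try:
        if text.endswith("%"):
            return ("percent", float(text[:-1]))
        if text.endswith("°"):
            # Convert an angle in degrees to a grade in percent.
            return ("percent", math.tan(math.radians(float(text[:-1]))) * 100.0)
    except ValueError:
        pass  # unit sign present, but the rest is not a number
    return ("other", text)


examples = [parse_incline(v) for v in ("6%", "8°", "-10%", "up", "moderate")]
```

Converting both units to percent makes tagged values directly comparable with incline values computed from elevation differences and distances.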
value of key ‘incline’ | share of OSM ‘highway’ features with incline information
‘up’ | 44.3 %
‘down’ | 30.8 %
others | 24.9 %
Table 2: Usage of the key 'incline' and its values17
16 Examples are: ‘incline=6%’, ‘incline=8°’, ‘incline=up’, ‘incline=down’
17 Source: https://taginfo.openstreetmap.org, checked on 15/07/2015
Out of over 83 million (83,299,544) OSM features tagged with ‘highway’, only 0.2 % (169,121)
have information about the incline. This also includes all paths, such as footpaths or bicycle
lanes. From Table 2 it can be seen that out of the 0.2 %, the main part (~ 75 %) has the value ‘up’
or ‘down’, giving only the information that the path is inclined, but not to what extent. For the other
25 %, the incline is mostly defined more specifically in percent or degrees; however, a few are also
described with words such as ‘moderate’ or ‘extremely steep’. To summarize, it can be said that
there is hardly any information present about the incline of paths in OSM, and where there is, the
information is not very specific. This has several possible reasons. On the one hand, it may be difficult
to attract the contributors’ attention to a tag which will not be displayed on the map. On the other
hand, the incline cannot be digitized from GPS traces or aerial imagery like other features. It
needs to be measured or estimated with the use of special tools such as a measuring tape, an
inclinometer or a smartphone with a built-in gyroscope. Measuring the incline is therefore
time-consuming and the contributors have to be on-site, since the incline cannot be mapped using
simple methods.
2.4 Data Mining
The aim of this thesis is to gain or extract information out of a vast amount of data. Such a process
is generally referred to as ‘Data Mining’. Next to data mining, the term knowledge discovery from
data is used in academic literature. While sometimes both terms are considered synonymous, data
mining can also be seen as one step in the process of knowledge discovery from data (KDD), as in
Fayyad et al. (1996). They describe the steps of the KDD process as data selection, preprocessing,
transformation, data mining, and interpretation and evaluation of the results. Data mining in
this process chain refers to “[…] applying data analysis and discovery algorithms that
produce a particular enumeration of patterns […] over the data”. Moreover, KDD is an iterative
process which may contain loops between any two steps. According to Han & Kamber (2006),
many fields consider the terms data mining and KDD as synonyms, probably because the term
‘data mining’ is much shorter. Hence, they define data mining as follows:
“Data mining is the process of discovering interesting patterns and knowledge from large
amounts of data. The data sources can include databases, data warehouses, the web, other
information repositories, or data that are streamed into the system dynamically.”
Consequently, for this thesis both terms are used synonymously as the entire process of gaining
knowledge, including all steps as mentioned in Fayyad et al. (1996).
Data mining can be applied to any kind of data, such as information about books in a library, data
about customers (personal data or transactions), search engine queries or user-generated content of
different online communities such as Facebook, Instagram or OpenStreetMap (cf. section 2.2). The
field of data mining has grown out of the need to handle such data, after devices and methods were
developed to capture and store data in these amounts. It comprises different methods and techniques,
such as the detection of patterns and the acquisition of knowledge about associations and correlations
within a data collection. A collection of data is therefore always needed, since such information cannot be
obtained from a single record (cf. Han & Kamber 2006, pp. 5-7).
For spatial or geographic data, special techniques and methods were developed and the field of
spatial data mining has emerged. Shekhar et al. (2004) argue that spatial data is not compatible
with regular data mining techniques, due to its complexity and intrinsic spatial relationships.
Spatial data mining uses techniques and methods from the field of spatial analysis as well as the
field of general data mining, as mentioned in the paragraph above. Mennis & Guo (2009) review
commonly used methods and techniques. Spatial classification can be divided into supervised
and unsupervised classification. Objects are grouped into classes based on their properties.
In contrast to unsupervised classification, which is also known as clustering, supervised
classification needs a training dataset to detect the members of a group. An unsupervised method is
spatial clustering, where points are classified according to their spatial location. Spatial classification
methods generally consider neighboring objects, which is not the case in general
classification methods. Another method commonly used in spatial data mining is point pattern
analysis. It is also a clustering method and tries to extract areas in which an unusual number of
events occur. An example is the detection of streets where accidents occur more often than on other
streets. This method is also known as hot spot analysis. Further information on clustering
methods can be found in Mennis & Guo (2009).
3 Related Work
This chapter describes which applications need incline information and which research has been
done with regard to mining information from user-generated GPS traces. Furthermore, research
related to the extraction of 3D information from GPS data is reviewed. For mining street
information from GPS data, it is essential to know on which street the traces were recorded. This
process is known as Map Matching; after a short review of different types of
algorithms, three of them are explained in more detail. At the end of this chapter, different methods
for smoothing time series measurements are reviewed. Smoothing will be an essential step in
preprocessing the GPS data.
3.1 3D Routing
There are several routing applications which rely on elevation information, such as routing for sport
activities, wheelchair routing or energy-efficient routing for electric-powered vehicles (e.g. E-cars,
Pedelecs[18] or electric wheelchairs). In the following section, different projects related to this topic
are presented. For projects relying on VGI, a common problem is the lack of information regarding
elevation or incline.
3.1.1 Wheelchair routing
Compared to navigation systems for cars, routing for mobility-restricted people, such as
wheelchair users, elderly people with pushchairs or temporarily impaired people, is more complex.
People belonging to one of these user groups may all have slightly different requirements for a
route. These depend strongly on the individual disability and the type of assistive equipment
(pushchair, manual wheelchair, electric wheelchair). Ding et al. (2007) studied the requirements for
a wheelchair navigation system through an empirical study with physically impaired people and
their assistants. Among other attributes, like the condition of the sidewalk or information about stairs
and ramps, the street incline is of high relevance for wheelchair routing. Furthermore, Menkens et
al. (2011) investigated the needs of wheelchair users regarding a navigation system
that meets their requirements. In terms of incline, they found that the maximum incline
which can be passed with a manual wheelchair is in general between 3% and 8%, and for electric
wheelchairs up to 10%.
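To make these thresholds concrete, the following minimal Python sketch computes the incline of a street segment in percent and compares it with a limit per wheelchair type. The function names and the dictionary of limits are hypothetical; the limits use the upper bounds of the ranges reported above, and treating uphill and downhill sections identically is a simplifying assumption.

```python
def incline_percent(rise_m: float, run_m: float) -> float:
    """Incline in percent: vertical rise divided by horizontal run, times 100."""
    if run_m <= 0:
        raise ValueError("run_m must be positive")
    return 100.0 * rise_m / run_m


# Assumed limits in percent, based on the ranges reported by Menkens et al. (2011).
# The manual limit is the upper bound of the 3% to 8% range.
MAX_INCLINE_PERCENT = {"manual": 8.0, "electric": 10.0}


def is_passable(rise_m: float, run_m: float, wheelchair: str = "manual") -> bool:
    """True if the absolute incline does not exceed the assumed limit.

    Assumption: downhill sections are treated like uphill sections.
    """
    return abs(incline_percent(rise_m, run_m)) <= MAX_INCLINE_PERCENT[wheelchair]
```

For example, a segment that rises 9 m over 100 m has an incline of 9% and would be rejected for a manual wheelchair but accepted for an electric one under these assumed limits.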
There are many investigations dealing with the development of routing algorithms meeting the
needs of mobility-restricted people (e.g. Müller et al. 2010; Neis & Zielstra 2014a). The main
problem is the lack of data regarding sidewalk information, sidewalk surface, curbs
and inclines. Although this data has already been acquired by governmental authorities or commercial
map providers, it is very costly. Therefore, most of the approaches rely on volunteered
geographic information, for example OpenStreetMap. In OpenStreetMap, that information is
theoretically freely available, but unfortunately hardly present in the dataset. As stated in section
2.3.3, incline values are only available for 0.2 % of the street segments. Approximately 75 % of
them contain only the values ‘up’ and ‘down’. These values only indicate that a street or path is inclined, but
do not specify the magnitude. Due to the different requirements of the users, this is not sufficient and
more accurate knowledge of the incline is required.
[18] Acronym for ‘Pedal Electric Cycle’. Pedelecs are bicycles with an assisting electric engine. The engine
supports the rider while pedaling up to a speed of 25 km/h (according to the German Road Traffic Licensing
Act (StVZO)).
Besides incline information, there is also a lack of other accessibility-related information in VGI.
Therefore, many routing services were developed which allow users to collect this information
(Kurihara et al. 2004; Menkens et al. 2011; Völkel & Weber 2008; Harriehausen-Mühlbauer 2014).
The idea is that users gather information about barriers or obstacles on
sidewalks while being navigated. This incrementally improves the quality of the route calculation
and also ensures that temporary barriers are captured. The incline of streets, however, needs to be measured
and can consequently not be determined with those systems. This shows the demand for an
alternative way to determine incline values of a street network.
3.1.2 Energy-efficient routing
Electric-powered vehicles, such as E-cars, Pedelecs or e-wheelchairs, are becoming more and more
popular, although according to Bachofer (2011) people are still skeptical. This can be explained
by high costs, long battery charging times and a shorter distance range. In addition, a reason
may also be the poor prediction of the distance range. Depending on the properties of the street, the
power consumption may vary. Both the surface material and the incline of the street decrease the
battery service life. Depending on the speed, the energy demand increases by 50% to 100% on an
incline of 4% (Bachofer 2011). Although the travel distance is longer, it might be beneficial if the
user takes a route around a hill or avoids streets with a bad surface. Consequently, knowledge of
the incline is an important factor for estimating the distance range per battery life. With a routing
service that considers the energy consumption of a street segment, the energy demand of a route
can be determined. This makes it possible to choose the most energy-efficient route or at
least gives a prediction of the battery’s distance range. This research field is known as EcoRouting
or Green Navigation (Bachofer 2011). It is less relevant for fuel-powered vehicles,
since they have a bigger distance range and can use a denser network of gas stations. Nevertheless,
EcoRouting can reduce fuel consumption and thereby carbon dioxide emissions.
Franke et al. (2012) developed an algorithm for energy-efficient routing of electrically powered
vehicles. The resulting navigation system is called eNav. Firstly, it calculates the power consumption
for each edge of the routing network, using the length of the edge and incline information.
Secondly, edges which cannot be passed by wheelchairs (e.g. because of steps) are rejected, and
accessibility information about edges and Points of Interest (POIs) is requested from other
platforms like rollstuhlrouting.de[19] or Wheelmap[20]. In the third step,
surface information is included in the routing algorithm. OpenStreetMap was taken as the data
source for the street network and for information about the street surface. For the calculation of the
incline the authors used airborne laser scanning data, which is of high accuracy but also very
expensive.
Sachenbacher et al. (2011) and Kono et al. (2008) also investigated the topic of energy-efficient
routing. The motivation of Sachenbacher et al. (2011) lies in the field of electric mobility;
they developed an algorithm for energy-efficient routing using OpenStreetMap street data and
the SRTM DEM with a horizontal resolution of 90 m. Contrary to the aforementioned investigations,
Kono et al. (2008) tried to minimize the fuel consumption of conventional cars by developing
an eco-friendly routing algorithm. To do so, they consider traffic information, geographic
information and even vehicle parameters. As elevation data they use a DEM with a horizontal
resolution, provided by the Geospatial Information Authority of Japan (GSI).
A freely available alternative to GPS traces for the derivation of incline values is SRTM. Bachofer
(2011) analyzed the influence of DEM accuracy on energy-related routing. He integrated
different DEMs into a routing system and found that the accuracy of the DEM influences
the modelled energy demand only to a minor degree. Consequently, he concluded that SRTM data
is sufficient for this use case. However, he also found that for some routes the
modelled energy demand was off by more than 30%.
3.2 Extraction of Street Attributes from user-generated Movement Trajectories
Mining street information from user-generated GPS traces has already been investigated by
several researchers. However, the focus was on deriving 2D information only and, to the best of my
knowledge, no literature exists in which the elevation of user-generated GPS traces was used to
derive 3D information. As shown in the following section 3.3, high-accuracy GPS measurement
techniques have been used to derive elevation-related information.
Van Winden (2014) proposed algorithms to automatically derive different road attributes, like the
direction of the road (one or two way), speed limit or number of lanes. As GPS input data he used
GPS traces acquired from 800 people during a certain time span. Therefore, the transportation
mode was known. The street network to be updated was taken from OpenStreetMap.
As in typical data mining processes (cf. section 2.4), the data needed to be preprocessed.
The GPS traces had to be semantically linked to the street on which the trace
was recorded (map matching). For this step the algorithm by Marchal et al. (2005) was used by requesting an
application programming interface (API).
[19] http://rollstuhlrouting.de, checked on 15/07/2015
[20] http://wheelmap.org, checked on 15/07/2015
Map matching was also an essential step in the research of Zhang et al. (2010). They used
GPS traces collected by the contributors of the OpenStreetMap project and aimed to derive street
attributes like the number of lanes and turning restrictions. Furthermore, they used the traces to
automatically correct the street centerline of the street network where it is geometrically
incorrect. For this purpose, a map matching algorithm was implemented, which is described in
detail in section 3.4.2. Additionally, they analyzed the coverage of the GPS traces, which
gives a first idea of what can be expected from this research. In their test area they discovered that
highways have 30 to 80 GPS traces whereas city roads have fewer than 20. Secondary roads in a
neighborhood have only a few GPS traces or none at all. With higher redundancy, better results can
be achieved.
3.3 Derivation of 3D information, using high-accurate GPS measurements
In this section, research is presented which involves the derivation of 3D information from GPS data
collected using high-accuracy GPS measurements. Due to the relatively high accuracy, redundancy
is not as crucial as in the work presented in section 3.2. To achieve a higher positioning
accuracy, different methods have been used.
Boucher (2013) used SBAS-GPS receivers to estimate the height of a street network. SBAS[21] is a
geostationary satellite augmentation system to support GPS. It sends correction data and thereby improves
the accuracy of GPS from 10 m to 2 m. The system which covers Europe is called
‘European Geostationary Navigation Overlay Service’ (EGNOS)[22]. The collected GPS traces were
fused with OSM street network data and the SRTM-3 DEM. The proposed method relies on GPS
measurements acquired under good conditions. The roof of a car was equipped with two SBAS-GPS
antennas and the car only drove on roads in open environments. Therefore, error sources
which are common in crowdsourced GPS trajectories, like obstruction by buildings or
multipath effects, are largely eliminated in this data. To resolve the remaining error, the SRTM-3
DEM is used to correct the discrete height of the GPS measurements. Matching the GPS traces and
the road network was done using a statistical method which makes use of the Mahalanobis
distance. The 3D road network was derived by fusing the three data sources sequentially using
Kalman filter techniques. According to an experimental validation, the road elevation estimation
could be improved using GPS trajectories in addition to the SRTM-3 DEM.
[21] http://en.wikipedia.org/wiki/GNSS_augmentation, checked on 15/07/2015
[22] http://www.essp-sas.eu/introducing_egnos, checked on 15/07/2015
The following work achieved even higher accuracies by using differential GPS with temporary base
stations (cf. section 2.1.1). Han & Rizos (1999) were motivated by the World Solar Challenge, a
special race for solar-powered cars across the Australian continent. The objective was to determine
the height profile of the road in order to optimize the race strategy. A car equipped with a differential
GPS device drove from Darwin to Adelaide, escorted by two cars acting as reference stations.
The road was divided into sections, and for each section the reference stations were parked at the
beginning and the end. A spatial Kalman-filtering technique was used to derive the height
information and to predict the incline.
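Both works in this section rely on Kalman filtering to combine elevation sources. The following Python sketch shows only the basic one-dimensional measurement update, applied to a static elevation value: a prior estimate (for example taken from a DEM) is corrected step by step with GPS elevations. It is not the formulation of Boucher (2013) or Han & Rizos (1999); the function names and the absence of a motion model are assumptions made for illustration.

```python
def kalman_update(estimate, variance, measurement, measurement_variance):
    """One scalar Kalman measurement update.

    The gain weights the new measurement by the relative uncertainty
    of the current estimate and the measurement.
    """
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def fuse_elevation(prior_elevation, prior_variance, gps_elevations, gps_variance):
    """Sequentially fuse GPS elevations (metres) into a prior elevation.

    Assumption: the true elevation at this location is static, so no
    prediction step (process model) is applied between updates.
    """
    estimate, variance = prior_elevation, prior_variance
    for z in gps_elevations:
        estimate, variance = kalman_update(estimate, variance, z, gps_variance)
    return estimate, variance
```

With equal variances for prior and measurement, a single update moves the estimate halfway towards the measurement and halves the variance; every further measurement reduces the variance again, which reflects the benefit of redundancy mentioned in section 3.2.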
3.4 Map Matching
In the field of navigation it is important to know on which street the carrier of a GPS device is
traveling. Furthermore, the position on that street segment is of importance. To solve this problem,
the recorded trajectory data of the moving object and the segments of the street network data need
to be semantically linked (cf. Figure 8). In the literature, algorithms for this task are referred to as Map
Matching (e.g. Quddus et al. 2007; Marchal et al. 2005). One may think that this is a straightforward
task, but due to inaccuracies of both input data sources it is more complicated. Especially in
regions where the GPS signal is generally of low quality (e.g. urban areas, forest) and a dense street
network exists, the quality of map matching may vary. Furthermore, errors and inaccuracies in the
street network data may cause a wrong match between the trajectory data and the street network.
Figure 8: Map Matching. The GPS trace (blue) is snapped to the street network (red) (Map: OSM)
3.4.1 Categorization of Map Matching Algorithms
Quddus et al. (2007) reviewed different map matching algorithms and categorized them
into four groups of approaches. The first group comprises geometric approaches. They
rely exclusively on the geometry of the trajectory and street network data and do not consider the
topology. This means that the connectivity of street segments is not used in the matching process.
Geometric algorithms take the geometry of either the single point positions or the trajectory as a
curve and search for the closest node or the closest curve within the street network. Consequently, the
approaches are called point-to-point, point-to-curve or curve-to-curve matching. Geometric
approaches are generally fast and easy to implement. Secondly, there is the
group of topological approaches. In addition to the geometry of trajectory and street network data,
they make use of the relationship between the segments of the street network. Two street segments
may, for example, be connected or disjoint. For the presented algorithms, the topology of the street
network was analyzed in advance. The third group consists of probabilistic map matching
approaches. For those approaches, the error of the GPS measurement is taken into account in the
form of an error ellipse. Street segments that intersect the error ellipse
are considered as matching candidates. If there is more than one candidate, properties like
speed or direction of the trajectory are used to detect the correct street segment. Advanced map
matching algorithms represent the fourth group. These algorithms use more
advanced techniques, such as Kalman filtering or other mathematical models. All algorithms
assume that the GPS trace was recorded while travelling along a street, rather than through
areas where no street can be found.
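As an illustration of the simplest group, the following Python sketch shows a point-to-curve match: each GPS point is assigned to the street segment with the smallest perpendicular distance, without using any topology. It assumes planar (projected) coordinates in metres and straight segments given by two end points; the function names and the dictionary layout are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_point_to_curve(p, segments):
    """Return the id of the closest segment.

    segments: dict mapping a segment id to its two end points (a, b).
    """
    return min(segments, key=lambda sid: point_segment_distance(p, *segments[sid]))
```

Because connectivity is ignored, two consecutive GPS points may be matched to segments that are not connected at all, which is exactly the weakness the topological approaches address.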
In the following section three Map Matching algorithms are described briefly. The first two
(Marchal et al. (2005) and Karussel (2014)) are already implemented and ready to use. Both
algorithms are designed for post-processing applications only and may potentially be used in this
research. The third algorithm, proposed by Zhang et al. (2010), is not yet implemented; however,
it is easy to implement.
3.4.2 Functionality of Selected Algorithms
The algorithm proposed by Marchal et al. (2005) is a topological algorithm, as it uses information
about the connectivity of the street segments. As already mentioned, this map matching algorithm
is implemented in the online service called Trackmatching[23]. The service provides an API, which
can be requested with a set of GPS points as input. As a response, the user gets a set of IDs
referencing the traveled street segments from OpenStreetMap. A disadvantage of this algorithm is
that it is limited to the street network, so GPS traces cannot be matched to sidewalks or bicycle
lanes. This makes it unsuitable for applications where the transportation mode can also be cycling
or walking. In short, the algorithm works as follows: The incoming GPS points are processed
sequentially and matched according to their distance to a street segment which is connected with
the previously matched segment. The first step is the initialization process. Starting with the first
GPS point, the three closest street segments are searched by calculating the Euclidean distance.
Using these segments, new candidate paths are created. Each candidate path has a score (weight),
which is the sum of the distances between the GPS points and the segment. The path with the
smallest score, and thus the smallest cumulative distance, is considered as the traveled route. If the end of
a street segment is reached, the algorithm searches for street segments which touch the end node.
All touching segments are now considered as matching candidates. For all candidates the cumulative
distance to the GPS points is calculated. Again, the segment with the lowest score is considered
as the traveled path.
[23] https://mapmatching.3scale.net/, checked on 15/07/2015
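The candidate-path idea described above can be sketched in Python as follows. This is a strongly simplified illustration and not the implementation of Marchal et al. (2005) or of the Trackmatching service: it keeps a fixed number of candidate paths, lets every candidate either stay on its last segment or continue on a connected segment for each new GPS point, and does not test explicitly whether the end of a segment has been reached. The names `match_topological`, `segments` and `adjacency` are hypothetical, and planar coordinates in metres are assumed.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_topological(points, segments, adjacency, n_candidates=3):
    """Return the sequence of segment ids with the lowest cumulative distance.

    segments:  dict id -> (a, b) end points
    adjacency: dict id -> iterable of ids of connected segments
    """
    def dist(p, sid):
        return point_segment_distance(p, *segments[sid])

    # Initialization: the closest segments to the first point start the candidate paths.
    start = sorted(segments, key=lambda sid: dist(points[0], sid))[:n_candidates]
    paths = [(dist(points[0], sid), [sid]) for sid in start]

    for p in points[1:]:
        extended = []
        for score, path in paths:
            last = path[-1]
            # Option 1: the point still lies on the last segment of the path.
            extended.append((score + dist(p, last), path))
            # Option 2: the trace continues on a segment connected to it.
            for nxt in adjacency.get(last, ()):
                extended.append((score + dist(p, nxt), path + [nxt]))
        # Keep only the candidates with the smallest cumulative distance.
        extended.sort(key=lambda candidate: candidate[0])
        paths = extended[:n_candidates]

    return paths[0][1]
```

In contrast to a purely geometric match, a segment can only be entered from a connected segment, so isolated jumps to a nearby but unconnected street are ruled out by construction.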
Another approach is the algorithm by Karussel (2014). According to the categorization by Quddus
et al. (2007) it may be categorized in the group of advanced algorithms, as it uses a routing engine
to estimate a path, rather than using the street network as input. The algorithm is implemented in
Java and is published[24] under the Apache License 2.0 and can therefore be used freely. The
algorithm uses the routing engine Graphhopper[25], a route planner which is based on OSM.
Firstly, for each GPS point the three closest street segments are searched and weighted. The weight
of an edge is the shortest distance to the GPS point. Once each GPS point has three weighted street
segments, the routing engine is requested to find the best path along all the selected street segments.
The best path is the one where the sum of all weights is the smallest. Because the path is computed
over all GPS points of a trace at once, the algorithm is unsuitable for real-time applications.
The advantage of this approach is that a realistic
path is found, even if the GPS trace is interrupted (e.g. in tunnels or in dense forests). However, to
calculate a path, Graphhopper needs to know whether the track was recorded while walking,
cycling or driving a car. This is a disadvantage for applications where the transportation mode is
not known. Another disadvantage is that the results are only routes computed by the route planner.
This probably leads to mismatches when a street was taken although it is not allowed (e.g. a
pedestrian walking in the opposite direction on a one-way street).
The third algorithm was proposed by Zhang et al. (2010). It is mainly a geometric algorithm, but also
uses a clustering method; therefore, it may also be categorized as an advanced method. Like the
aforementioned two algorithms, it uses the OSM street network as input data. Within the algorithm,
three conditions are checked using the street segment and the GPS traces: distance, direction
and angle. Note that this method uses the GPS traces as curves, rather than processing the GPS
points sequentially. First of all, profile lines perpendicular to the street are created at a specific
distance from each other. The length of the lines is 30 m, which has been found to be a reasonable
value considering the error of the GPS traces and the width of the street. All traces which intersect
the perpendicular lines are initially seen as matching candidates. In case of a one-way road, a
candidate is removed from the list if the GPS trace runs in the opposite direction to the one-way
road. By convention, the direction of one-way roads in OSM is specified through the digitization
direction. The third condition is the angle between the trace and the road. If this angle is greater
than 20 degrees, the GPS trace is also removed from the candidate list. The three conditions
select the corresponding traces, but they can still yield mismatches (false positives) if two
two-way roads are parallel and too close to each other. This is often the case where there are
street-accompanying bicycle lanes and sidewalks. In this case, the GPS traces are matched using a
clustering method.
[24] https://github.com/graphhopper/map-matching, checked on 15/07/2015
[25] https://graphhopper.com/, checked on 15/07/2015
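The three conditions can be expressed compactly. The Python sketch below is an illustration under stated assumptions, not the implementation of Zhang et al. (2010): the profile line is assumed to be centred on the street, so a trace crosses it if its offset from the centerline is at most half the profile length; directions are given in degrees; and the function and parameter names are hypothetical. The 30 m profile length and the 20 degree threshold are the values mentioned above.

```python
import math


def direction_deg(a, b):
    """Direction of the vector a->b in degrees, counter-clockwise from the x-axis."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360.0


def angle_between(d1, d2):
    """Smallest absolute difference between two directions, in [0, 180] degrees."""
    diff = abs(d1 - d2) % 360.0
    return min(diff, 360.0 - diff)


def is_candidate(offset_m, road_dir, trace_dir, one_way,
                 profile_length_m=30.0, max_angle_deg=20.0):
    """Check the distance, direction and angle conditions for one trace.

    offset_m: distance between trace and centerline along the profile line.
    """
    # Distance: the trace has to cross the profile line
    # (assumption: the line is centred on the street).
    if abs(offset_m) > profile_length_m / 2.0:
        return False
    angle = angle_between(road_dir, trace_dir)
    if one_way:
        # Direction and angle: the trace must follow the digitization direction.
        return angle <= max_angle_deg
    # Two-way road: the trace may run in either direction.
    return min(angle, 180.0 - angle) <= max_angle_deg
```

A trace running at 190 degrees along a road digitized at 0 degrees is therefore accepted on a two-way road, since it deviates by only 10 degrees from the reverse direction, but rejected on a one-way road.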
3.5 Smoothing of Time Series Measurements
The input data for this research are GPS traces collected by users with low-cost GPS devices. As
stated in section 2.1, a certain amount of noise is expected in the data. According to Haining (2003),
smoothing algorithms can be used to remove the noise and improve the accuracy of the derived
information. In his book, he reviews different smoothing methods and techniques. Non-linear smoothing
methods, such as median smoothing, and linear smoothers, such as mean smoothers, are mentioned.
Linear smoothers should be chosen if no abrupt changes are expected in the data. This is due to
the fact that peaks or small-scale features will be removed by averaging values instead of taking the
median. The elevation profile of a street on which the GPS traces have been recorded usually
follows a continuous line. Consequently, the elevation profile of the GPS measurements should
theoretically not contain any discontinuities; if it does, these can be considered
outliers and should be flattened.
For both linear and non-linear smoothers, a window of a certain size is centered on each data
point of the series. The data point on which the window is centered is assigned a smoothed
value that considers all neighboring data points within the specified window. From these data points,
a weighted average or the median can be used as the new value. When determining the weighted
average, the weights can be selected with regard to the distance between the data points; however,
they are normalized to one (Haining 2003, p. 231). Points further away consequently influence the
smoothed value less than closer points. The size of the window, and thus the number of data points
which influence the result, should be chosen depending on the desired degree of smoothing. A
bigger window size uses more information from points which are further away. This leads to
a higher precision, although it can also lead to biases. A smaller window size decreases the risk of
introducing a bias, but the precision is lower, because a smaller sample and consequently less
information is used (cf. Haining 2003, p. 229).
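The window principle can be sketched in a few lines of Python. The example below implements a weighted moving average with weights normalized to one and, for comparison, a moving median over the same kind of window. The function names are hypothetical, and keeping the raw value at positions where the window does not fit is only one possible way to treat the ends of the series.

```python
from statistics import median


def _smooth(values, size, reducer):
    """Slide a centered window of odd size over values and apply reducer to it."""
    if size < 1 or size % 2 == 0:
        raise ValueError("window size must be a positive odd number")
    half = size // 2
    smoothed = []
    for i, value in enumerate(values):
        if i < half or i >= len(values) - half:
            # Assumption: keep the raw value where the window does not fit.
            smoothed.append(value)
        else:
            smoothed.append(reducer(values[i - half:i + half + 1]))
    return smoothed


def smooth_weighted_mean(values, weights):
    """Linear smoother: weighted average with weights normalized to sum to one."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return _smooth(values, len(weights),
                   lambda window: sum(w * v for w, v in zip(normalized, window)))


def smooth_median(values, size=3):
    """Non-linear smoother: median of the values inside the window."""
    return _smooth(values, size, median)
```

For the series 0, 0, 12, 0, 0 and the weights 1, 2, 1, the weighted average spreads the single spike over its neighbors (0, 3, 6, 3, 0), whereas the median of three removes it completely.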
In the following, Tukey's (1977) ‘Median of 3’ algorithm is explained in more detail. Although it is
a non-linear smoothing algorithm and thus not likely to be used in this research, its functionality is
very similar to that of a weighted moving average smoother.
Tukey (1977, pp. 210-213) presents an algorithm for smoothing equidistant sequences of numbers,
namely ‘Medians of 3’. To start the smoothing, the second number as well as the previous and
following number are selected. The process has to start with the second number, since the first does not
have a left-hand neighbor. As a result, the first and, correspondingly, the last position of the
sequence receive no value. From the selected three numbers, the second is assigned a new (smoothed) value
considering the two neighbors. After ordering the selected numbers, the median can be determined
easily. The determined median is then assigned to the second value. This process is repeated
for all values of the sequence. It is important that the neighbor on the left-hand side is selected
from the raw data, rather than from the smoothed data. The smoothing can then be applied to the
smoothed sequence again to achieve a higher degree of smoothness. Figure 9 shows a raw
sequence of numbers (row 1) as well as the data after single (row 2) and repeated (row 3) smoothing.
Row 1 (raw data):          13    7    9    3    4   11   12 1304   10   15   12   13   17   20   24
Row 2 (single smooth):      -    9    7    4    4   11   12   12   15   12   13   13   17   20    -
Row 3 (repeated smooth):    -    -    7    4    4   11   12   12   12   13   13   13   17    -    -
Figure 9: Example of 'Median of 3' smoothing with the raw data (row 1) and the results using the single median
smoothing (row 2) and the repeated median smoothing (row 3). (Tukey 1977, p. 212)
This approach is problematic, as the sequence gets shorter because of the missing values at both ends.
To solve this problem, Tukey (1977) proposed two approaches. The first and simplest one is to just
copy the end values from the raw data to the smoothed values. The second and more complicated
one is the so-called ‘end-value smoothing’. The missing end value is the median of the
following three values, which are derived only from values at that end of the sequence. For the first
value of the sequence, the corresponding numbers from the example in Figure 9 are shown in parentheses.
Value 1: the actual raw value (13)
Value 2: the second value of the first smooth (9)
Value 3: the second value of the first smooth (9) plus two times the difference between the second and
the third value of the first smooth (9, 7): 9 + 2 * (9 - 7) = 13
Consequently, the first value of the smoothed sequence will be 13. An illustration for better
understanding is shown in Tukey (1977, p. 222).
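The procedure described above, including the end-value rule, can be reproduced with the following Python sketch. It follows the description in this section rather than a reference implementation; the function names are hypothetical and missing end values are represented by `None`.

```python
from statistics import median


def median_of_3(values):
    """One pass of running medians of 3; both end positions stay None."""
    smoothed = [None] * len(values)
    for i in range(1, len(values) - 1):
        # The neighbors are always taken from the input sequence,
        # never from the values already smoothed in this pass.
        smoothed[i] = median(values[i - 1:i + 2])
    return smoothed


def end_value(raw_end, nearest, second_nearest):
    """End-value rule: median of the raw end value, the nearest smoothed value
    and the linear extrapolation from the two nearest smoothed values."""
    extrapolated = nearest + 2 * (nearest - second_nearest)
    return median([raw_end, nearest, extrapolated])


def smooth_with_ends(values):
    """Running medians of 3 with both end values filled by the end-value rule."""
    if len(values) < 4:
        raise ValueError("at least four values are needed")
    smoothed = median_of_3(values)
    smoothed[0] = end_value(values[0], smoothed[1], smoothed[2])
    smoothed[-1] = end_value(values[-1], smoothed[-2], smoothed[-3])
    return smoothed
```

Applied to the raw data of Figure 9, `median_of_3` yields row 2, and applying it again to the inner values of row 2 yields row 3. `smooth_with_ends` fills the first position with 13, as derived above, and the last position with 24, the median of 24, 20 and 20 + 2 * (20 - 17) = 26.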
4 Methodology
This chapter describes all steps which are necessary for the derivation of incline values for the
street network. First of all, the pilot region in which the approach is tested is defined.
Secondly, the tools and data used are described in detail. This is followed by a detailed description of the
workflow and implementation, including data import, preprocessing, map matching and the
calculation of the incline. The last section of this chapter describes how the derived
information can be validated.
4.1 Definition of Pilot Region
The pilot region for this research is the region around Heidelberg in the south-west of Germany. In
Figure 10 the extent of the region is indicated by the red bounding box. The region was chosen
since it is also one of the pilot regions for the project CAP4Access. In projected coordinates, the area is almost a
square with a side length of approximately 22 km. This results in an area of approximately
497 km². The area is bounded by the localities Leutershausen in the north, Neckarsteinach in the
east and Nussloch in the south, and in the west it reaches almost to Mannheim. The region is
characterized by mountainous and forested areas in the east as well as flat urban areas and farmland in the
west. This is particularly suited for this research, since it makes it possible to differentiate the
results between land use classes and other characteristics.
Figure 10: Pilot region Heidelberg / Germany. (Map: OSM)
4.2 Tools
In order to work on this topic and implement the steps of the workflow, a range of tools was used.
As programming language, Java was used to implement the main part of the software. The
extensive number of spatial algorithms implemented within the Java libraries GeoTools[26] and Java
Topology Suite (JTS)[27] made it convenient and efficient to work with geometries. Both libraries
are licensed under the open-source license LGPL and can easily be integrated into the project using
Maven dependency management[28]. Maven is a tool which helps to build Java projects and manages
the integrated libraries. As an integrated development environment (IDE), Eclipse was used.
An IDE supports programmers with a source code editor, syntax highlighting, recommendations
and a useful debugging tool.
To store the input data as well as the results, a PostgreSQL (PGSQL) database has been used. In
order to work with spatial data the extension PostGIS has been added to the installation of PGSQL.
It also provides many functions which implement geometric algorithms, however, the processing
was mainly done in Java.
For visualization of intermediate results and some preprocessing tasks, the GIS tools ArcGIS and
one of its open-source alternatives, QGIS, have been used. QGIS was used in addition to ArcGIS,
since connecting to the database and loading the data is very easy and intuitive.
4.3 Data
The two main data sources for this research are the crowdsourced GPS traces and the street
network. Both data sources are described in detail in the following sections. Furthermore, additional data
such as land use classes and digital elevation models (DEMs) is described, which is not needed for
the calculation of the incline, but used for the evaluation of the result.
4.3.1 Crowdsourced GPS traces
Crowdsourced, or user-generated, GPS traces are one of the data sources for the determination of
street incline. Firstly, this section gives an overview of different platforms, projects and applications
in which GPS traces are crowdsourced by volunteers. Secondly, the format for exchanging
GPS traces, GPX, is introduced. After that, it is described how GPS traces are handled within the
OpenStreetMap project, since the GPS traces of OSM will be used for this research. At the end,
typical errors within the data are discussed briefly.
[26] http://www.geotools.org, checked on 15/07/2015
[27] http://www.vividsolutions.com/jts/JTSHome.htm/, checked on 22/05/2015
[28] http://maven.apache.org/, checked on 22/05/2015
4.3.1.1 Platforms and Devices
There are several platforms and applications in which GPS traces are collected for different
purposes. One example is sport-tracking apps for smartphones, such as Strava[29], Runtastic[30] or
Runkeeper[31], which track the user’s way while training to provide statistics about the activity, such
as distance, average speed, total climb or the elevation profile. Other examples are platforms such
as gpsies.com[32], whose purpose is to exchange and recommend traveled routes for outdoor
activities. The collection of GPS traces within the OpenStreetMap project has the purpose of
supporting the map making.
The devices which are usually used to record GPS traces, such as smartphones or handheld GPS
devices, have integrated low-cost GPS receivers (Heipke 2010). Depending on the device or
smartphone app used, the elevation information of the track points may originate from a different
source than GPS. Some devices, especially handheld GPS devices, have built-in barometers, which
determine the elevation by measuring the change of air pressure. This can lead to high systematic
errors if the barometer is not calibrated properly. Another source of elevation values in crowdsourced
GPS traces is elevation databases. The sport-tracking services Runkeeper and Strava replace the elevation
measured by the GPS receiver with values from an elevation database. The reason is that, due to the poor vertical
accuracy of GPS (cf. section 2.1.2), a plotted elevation graph or the calculated total climb may be
wrong. While Strava uses an elevation database without mentioning the source of the elevation
information, Runkeeper uses the third-party service topocoding.com, which is based on elevation
information from SRTM[33],[34]. However, neither service specifies whether the measured elevation is
only replaced for calculation and visualization or whether the exported GPX files also contain the
elevation values from the database rather than the original measurements. GPS traces uploaded to
the OpenStreetMap project might have been recorded with the aforementioned apps. This results in the
problem that the GPS traces may contain elevation information which was not actually
measured by GPS, but taken from other sources. Depending on the device or smartphone application,
the elevation does not necessarily reference the WGS 84 ellipsoid (cf. 2.1), but could also be
referenced to the mean sea level.
29 https://www.strava.com/, checked on 23/06/2015
30 https://www.runtastic.com/, checked on 23/06/2015
31 http://runkeeper.com/, checked on 23/06/2015
32 http://gpsies.com/, checked on 23/06/2015
33 https://strava.zendesk.com/entries/20965883-Elevation-for-Your-Activity, checked on 22/05/2015
34 https://support.runkeeper.com/hc/en-us/articles/201109736-How-does-RunKeeper-calculate-elevation-and-climb-, checked on 22/05/2015
4.3.1.2 The GPX Format
For the exchange of GPS traces between the device and one of the aforementioned platforms, the GPX
format is commonly used. GPX is the abbreviation of ‘GPS Exchange Format’ and is an XML-based
format. Figure 11 shows an example GPX file, which is an instance of the GPX schema version 1.1³⁵.
The root element ‘gpx’ carries the version and the schema location as attributes. Its child
elements are waypoints (‘wpt’) and tracks (‘trk’). Waypoints are points which have been stored
separately in order to mark locations, such as points of interest. The track element contains the
actual GPS trajectory, which is referred to as GPS trace or simply trace within this work. A trace
may contain several track segments (‘trkseg’), which in turn contain track points (‘trkpt’). Each
track point must at least have the attributes latitude (‘lat’) and longitude (‘lon’). The
coordinates are given in the geographic reference system WGS84. Additional optional information,
such as timestamp or elevation, can be stored as child elements (cf. Ramm & Topf 2010, p. 26).
Figure 11: Example GPX file.
35 http://www.topografix.com/gpx/1/1/gpx.xsd, checked on 23/06/2015
<?xml version="1.0" encoding="UTF-8"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     version="1.1"
     xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd">
  <wpt lat="49.4056396484375" lon="8.684947967529297">
    <name>Castle</name>
  </wpt>
  <wpt lat="49.410247802734375" lon="8.692361831665039" />
  <trk>
    <name>example gps track</name>
    <trkseg>
      <trkpt lat="49.411144" lon="8.705768">
        <ele>194.31606</ele>
        <time>2015-05-29T16:18:32Z</time>
      </trkpt>
      <trkpt lat="49.41126" lon="8.70587">
        <ele>250.8594</ele>
        <time>2015-05-29T16:18:34Z</time>
      </trkpt>
      <!-- ... -->
    </trkseg>
  </trk>
</gpx>
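To illustrate the structure of a GPX file, the following minimal Java sketch reads the track
points of a GPX document with the DOM parser of the Java standard library. It is not the import
routine used in this thesis. The class name GpxTrackPointReader and the convention of representing
a missing elevation as NaN are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative reader for GPX track points (hypothetical helper, not the thesis code). */
final class GpxTrackPointReader {

    /** One track point; ele is NaN when the optional <ele> child is missing. */
    static final class TrackPoint {
        final double lat;
        final double lon;
        final double ele;

        TrackPoint(double lat, double lon, double ele) {
            this.lat = lat;
            this.lon = lon;
            this.ele = ele;
        }
    }

    /** Parses a GPX document and returns all track points in document order. */
    static List<TrackPoint> read(String gpxXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(gpxXml.getBytes(StandardCharsets.UTF_8)));

            // "*" matches any namespace, so GPX 1.0 and 1.1 documents are both accepted.
            NodeList nodes = doc.getElementsByTagNameNS("*", "trkpt");
            List<TrackPoint> points = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element trkpt = (Element) nodes.item(i);
                double lat = Double.parseDouble(trkpt.getAttribute("lat"));
                double lon = Double.parseDouble(trkpt.getAttribute("lon"));
                NodeList ele = trkpt.getElementsByTagNameNS("*", "ele");
                double height = ele.getLength() > 0
                        ? Double.parseDouble(ele.item(0).getTextContent().trim())
                        : Double.NaN;
                points.add(new TrackPoint(lat, lon, height));
            }
            return points;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse GPX document", e);
        }
    }
}
```

Calling GpxTrackPointReader.read with the content of a GPX file returns one entry per ‘trkpt’
element, regardless of the track or track segment it belongs to.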
4.3.1.3 OpenStreetMap GPS traces
The main input data for this research are the GPS traces collected by OSM contributors. The
so-called gpx-planet file³⁶ contains all original GPS traces as GPX files from all over the world.
The latest version of this file dates from April 2013. The script to create the dump is available
online³⁷, but it needs to be run against the OpenStreetMap database, to which only OSM
administrators have access. Therefore, a new dump cannot be created and the one from April 2013 has
to be used.
The data was collected by thousands of users and contains more than 2.5 billion track points. As
shown in Figure 12, which depicts the number of GPS track points per grid cell, the majority of
points can be found in Europe. Especially in Germany, Austria and Switzerland, a higher density
than in other European countries, such as Spain, can be observed. This may be due to a higher
population density or a higher motivation to collaborate in such projects.
Figure 12: Screenshot of a grid map showing the number of GPS points per grid cell.³⁸
There are also regional extracts of the gpx-planet file available³⁹, which make processing of the
dataset more convenient if one is only interested in a certain region. Besides the gpx-planet file
as data source, there is an Application Programming Interface (API)⁴⁰, which allows the user to
access the track points within a given region using HTTP requests. On the website of OSM, there is
a public list showing the uploaded traces. Traces can be uploaded manually using an online form⁴¹
or via the API using HTTP POST. In order to supply additional information about the trace, a
description must be given and a comma-separated list of keywords can be added. This makes it
possible to search and find traces by keywords. Each uploaded trace must also be assigned a
visibility. Some users might not want to be linked to the uploaded traces, since one may draw
conclusions about the user’s location and movement profile. Table 3 shows the four options
‘identifiable’, ‘public’, ‘trackable’ and ‘private’ and their explanations.
36 http://planet.openstreetmap.org/gps/, checked on 22/05/2015
37 https://github.com/iandees/planet-gpx-dump/, checked on 23/06/2015
38 Screenshot taken from http://resultmaps.neis-one.org/osmgps.html, checked on 22/05/2015
39 http://zverik.osm.rambler.ru/gps/files/extracts/index.html, checked on 22/05/2015
40 http://wiki.openstreetmap.org/wiki/API_v0.6#GPS_traces, checked on 22/05/2015
Visibility     Description
Identifiable   - shown in the public trace list
               - points with timestamp served over the API
               - contained in gpx-planet file
               - linked to trace page via API
               - conclusions about the contributing user can be drawn
               - access to raw GPX file possible via trace page
Public         - shown in the public trace list
               - points with timestamp served over the API
               - contained in gpx-planet file
               - not linked to trace page via API
               - access to raw GPX file only via public trace list
Trackable      - not shown in the public trace list
               - points with timestamp served over the API
               - contained in gpx-planet file
               - not linked to the trace page
               - no access to raw GPX file
Private        - not shown in the public trace list
               - points without timestamp served over the API
               - not contained in gpx-planet file
               - not linked to the trace page
               - no access to raw GPX file
Table 3: Overview of visibility options for the upload of GPS traces⁴²
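As an illustration of the API access to track points mentioned before Table 3, the following Java
sketch builds the request URL of the ‘trackpoints’ call of API v0.6, which expects the bounding box
in the order left, bottom, right, top and a page number. The base URL and the helper class
TrackpointsRequest are assumptions for this example; the API was not used to obtain the data for
this thesis.

```java
import java.util.Locale;

/** Builds request URLs for the OSM API v0.6 'trackpoints' call (illustrative helper). */
final class TrackpointsRequest {

    // Assumed base URL of the public OSM API; adjust for other servers.
    static final String API_BASE = "https://api.openstreetmap.org/api/0.6";

    /**
     * Returns the URL for one page of track points within the given bounding box.
     * Coordinates are WGS84 degrees; pages are numbered from 0.
     */
    static String url(double left, double bottom, double right, double top, int page) {
        if (left >= right || bottom >= top) {
            throw new IllegalArgumentException("Bounding box is empty or inverted");
        }
        if (page < 0) {
            throw new IllegalArgumentException("Page number must not be negative");
        }
        return String.format(Locale.ROOT, "%s/trackpoints?bbox=%s,%s,%s,%s&page=%d",
                API_BASE, left, bottom, right, top, page);
    }
}
```

The returned string can be passed to any HTTP client; the response is a GPX document containing
the track points of the requested page.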
The gpx-planet file has been imported, and the public trace list has been requested in order to
access the traces uploaded after April 2013 (which are not contained in the dump). In total, 4194
GPS traces from the gpx-planet file are within the pilot region. Of these, 86% (3606) have
elevation information and can therefore be used for this research. With additional traces from the
public trace list, the number increases to 3842 traces. In total, there are over two million GPS
track points in the area of the pilot region (~497 km²). Assuming that the pilot region was a
square with a side length of 22,000 m (484 km²) and the points were evenly distributed, each point
would cover at most about 240 m², which corresponds to one GPS point per cell of roughly
15 m x 15 m.
41 http://OpenStreetMap.org/traces, checked on 22/05/2015
42 source: http://wiki.openstreetmap.org/wiki/Visibility_of_GPS_traces, checked on 20/07/2015
4.3.1.4 Typical Errors
As already mentioned in section 2.1.2, GPS measurements suffer from multiple errors and
inaccuracies. Therefore, the elevation profile of a GPS trace always contains noise, meaning that
neighboring points on flat terrain will often have different elevation values. Figure 13 shows the
elevation profile of a GPS trace on a fairly flat street. It can be seen that the measured
elevation continuously fluctuates within a range of ±2 m. An analysis regarding the elevation
accuracy of crowdsourced GPS data is given in section 5.1.
Figure 13: Elevation profile of a GPS trace, recorded on a flat street.
In addition to the noise, the receiver may lose the GPS signal and thus its position fix. The next
point of the trace is then only recorded once the satellite signal is received again. This may for
example happen in tunnels or in other situations where no signal can be received. This phenomenon
results in traces which contain long distances between two adjacent points and which do not
represent the course of the street. Figure 14 shows a few examples.
Figure 14: GPS traces with lost GPS-signals in tunnels. (Map: OSM)
4.3.2 Street Network
The street network will be enhanced with the calculated incline values. In principle, any
collection of street geometries may be used for this research; however, OSM was chosen as the data
source. OpenStreetMap data is openly licensed and its data model is commonly known in the domain of
volunteered geographic information. The streets are represented as LineStrings or, in terms of
OSM, ways. The geometry specifies the centerline of the street rather than its outline. The streets
are classified using different tags. For the pilot area, 57,824 street elements with a total length
of around 5336 km were extracted. Table 4 shows the values of the key ‘highway’ which were used to
extract the streets from the OpenStreetMap dataset, together with the share of each street type in
the total length in percent. The street network is composed of different types of streets and
paths. This includes ways which are dedicated to cars, pedestrians, cyclists or a combination of
these. For convenience and to avoid confusion, the individual parts of the street network are
referred to as ‘streets’ in this thesis, although they also include paths which cannot be used by
car.
value           Description⁴³                                          share in %
track           agricultural, forestry streets                         43.84
residential     streets within residential areas                       18.38
path            mainly hiking trails and small paths                    9.56
footway         for pedestrians only                                    8.13
secondary       country road of second priority                         4.34
tertiary        country road of third priority                          2.84
cycleway        for cyclists only                                       2.62
living_street   streets where pedestrians have priority over cars       2.01
motorway        equivalent to autobahn                                  1.98
unclassified    roads of lower priority than tertiary                   1.91
primary         country road with highest priority                      1.33
others⁴⁴                                                                3.06
Table 4: Values of highway tag and their share of length in percent.
The relatively high share of streets with the tag ‘highway=track’ can be explained by the high
proportion of forest and fields in the pilot area. Compared to residential streets, the total
length of footways and cycleways is relatively small, although residential streets often have
adjacent footways. Instead of mapping a footway as an individual way, the information can also be
added as a tag to the street geometry. The same holds true for bicycle lanes. As already mentioned
in section 2.3.1, streets which are important for pedestrians or wheelchair users have a large
share. This is another argument for using OpenStreetMap data instead of commercial street network
data.
43 http://wiki.openstreetmap.org/wiki/Map_Features#Highway, checked on 27/05/2015
44 trunk, motorway_link, pedestrian, trunk_link, secondary_link, primary_link, road, tertiary_link
4.3.3 Land Use Information
Information regarding the land use is used in order to classify the results in chapter 5. For this
purpose, the data of OpenStreetMap was likewise used to extract polygons with certain land uses.
There is an extensive list of tags in the OpenStreetMap Wiki (2015b) which indicate different land
use classes, such as ‘residential’, ‘forest’ or fields. Since different land uses usually have
different characteristics in terms of the visibility of satellites and the obstruction by man-made
structures, the results are expected to depend on the land use. There are many tags which describe
areas with different land uses, but only the seven most common tags have been extracted. Table 5
shows the different land uses and their characteristics. Rural areas like farmlands and allotments
are characterized by fields and mainly low buildings. This means there are almost no structures
which may influence the GPS signal through multipath effects or shadowing. In OpenStreetMap, two
tags are used for farmland (‘landuse=farm’ and ‘landuse=farmland’). According to the wiki, both
tags mean the same, although ‘farmland’ should be preferred over ‘farm’. Therefore, both are
combined and termed ‘farmland’. Urban areas such as commercial, industrial or residential areas are
characterized by taller buildings and urban canyons (Langley 1999), where multipath and shadowing
effects are more likely. Due to the quick change between reflected signals, shadowing and open view
(at crossroads), a heterogeneous GPS quality can be expected. In forested areas, which are mainly
covered by a dense tree canopy, it is very likely that the GPS signal is shadowed. In these areas,
a homogeneously degraded GPS quality can be expected.
value of key ‘landuse’   Characteristics
allotments               - Gardening
                         - Small buildings
commercial               - Office buildings
farm                     - Land used for farming
                         - No buildings, no trees
                         - Tillage and pasture
farmland                 - Same as ‘farm’
                         - Should be used instead of ‘farm’
forest                   - High trees
                         - Dense canopy
industrial               - Factories or warehouses
                         - Wide streets for trucks and delivery vehicles
                         - Less obstruction than residential
residential              - Urban environment
                         - Tall buildings
                         - Narrow streets, trees
Table 5: OSM landuse-tags and their characteristics.
4.3.4 Digital Elevation Models
Within this research, digital elevation models (DEMs) are used during the evaluation of the
results. A DEM is a representation of the earth’s surface and may be acquired with different
methods, such as terrestrial surveying or remote sensing techniques like stereo photogrammetry,
radar systems or LiDAR (Light Detection and Ranging). Only airborne or spaceborne remote sensing
techniques make it possible to acquire data for a larger region in a reasonable time. From the
measurements, a DEM can be generated using data analysis techniques such as interpolation. ‘Digital
elevation model’ is a general term which covers both specifications, ‘digital surface model’ (DSM)
and ‘digital terrain model’ (DTM). There are different definitions of the terms (Zhilin et al.
2005; Cartwright et al. 2007); in this research, they are used as follows. As shown in Figure 15,
DSMs and DTMs can be differentiated by the structures which are included in the model. While a DSM
is the representation of the earth’s surface including all objects on it, such as trees and
buildings, those objects are excluded in a DTM. Remote sensing techniques measure the surface (e.g.
top of building, top of tree canopy). Consequently, the measurements need to be corrected in order
to obtain a DTM which excludes the objects located on the terrain.
Figure 15: Difference between DSM and DTM.⁴⁵
DEMs can also be classified in terms of horizontal resolution. Czegka et al. (2004) distinguish
high-resolution DEMs with a cell size smaller than 10 m, medium-resolution DEMs with a cell size of
30 m to 100 m and low-resolution DEMs with a cell size greater than 500 m.
For the validation within this research, a high-resolution DEM is used as reference data against
which the derived incline values are compared. Furthermore, it is used for the error analysis of
the GPS points. The DEM was computed from LiDAR measurements by the ‘Landesamt für Geoinformation
und Landentwicklung Baden-Württemberg’ between the years 2000 and 2005. It represents the terrain
without buildings and trees and is consequently a DTM. Areas which could not be measured, such as
buildings or very dense forests, were interpolated using neighboring points. The available data
covers the entire pilot region with a horizontal resolution of 1 m and a vertical accuracy of
0.5 m. The DTM was provided as a text file of points containing evenly distributed XYZ
coordinates. With the help of the ArcGIS function ‘Point to Raster’, the DTM has been converted to
a georeferenced raster file. The coordinates were given in the Gauss-Krüger coordinate system. The
elevation values are heights above sea level and thus refer to a quasigeoid. Contrary to a geoid, a
quasigeoid is not an equipotential surface; however, it deviates from the geoid only in the range
of millimeters to centimeters (cf. Torge 2001, p. 82). Since the GPS points as well as the street
network data are given in the World Geodetic System 1984 (WGS84), the horizontal datum of the DTM
is transformed to WGS84. This enables the evaluation of the GPS accuracy.
In addition to the high-resolution DTM, the SRTM-1 DEM is used to compare the calculated inclines
with data which is, like crowdsourced GPS points, free and globally available. The SRTM DEM was
acquired by the Shuttle Radar Topography Mission (SRTM). More precisely, the SRTM DEM is a DSM,
since objects on the earth’s surface have not been removed. The SRTM mission is a joint project of
NASA, the National Geospatial-Intelligence Agency and the German as well as the Italian space
agencies. The DEM is available between 60° north latitude and 56° south latitude and thus covers
80% of the land on the earth. The data is available with WGS84 as horizontal datum and EGM96 as
vertical datum. The elevation therefore refers to the geoid rather than to the ellipsoid. The
absolute height error in Eurasia was identified by Farr et al. (2007) as 6.2 m. Another freely
available DEM is the ASTER Global DEM. It was compiled from data collected by the ‘Advanced
Spaceborne Thermal Emission and Reflection Radiometer’, mounted on the Terra spacecraft. Like
SRTM, it has a horizontal resolution of 30 m, and it covers the land between 83° north and 83°
south latitude. Although its coverage is better than that of SRTM, its vertical accuracy is worse
with only 9.2 m (Meyer 2011). Therefore, SRTM, as the more accurate of the two free and almost
globally available data sources, has been chosen for the evaluation.
45 Adopted from: http://www.gsi.go.jp/WNEW/TEC-NEWS/2007-tec172.html, checked on 22/05/2015
4.4 Workflow and Implementation
To arrive at the final result of street segments enhanced with incline values, several steps need
to be performed, as depicted in Figure 16. After the import of both the street network data and
the GPX files, the data sets need to be preprocessed in order to prepare them for the following
steps. The GPS traces must first be linked to the individual street segments. This step is known as
map matching and detects the GPS traces which were recorded on each street segment. After that, the
incline of each street segment can be calculated from the assigned GPS traces. Then, the calculated
incline values are compared with the incline values calculated from the LiDAR DTM and the SRTM DSM.
In the following sections, all individual steps are described in more detail.
Figure 16: Process of deriving incline information out of user-contributed GPS traces.
Since the tools developed for the purpose of this thesis may also be of interest to other users and
for other use-cases, they are planned to be published under an open-source license. Therefore, each
step is implemented as an individual tool and all intermediate results are saved in the database,
so that the single steps can be used individually. The tools are designed to be as generic as
possible, so that they can be used with different data sources. The requirements on the data
sources in terms of modelling are kept as low as possible. The latest version of each developed
tool can be found on the CD attached to this master’s thesis. Additionally, some of the tools are
accessible on the GitHub account of the GIScience Research Group of the University of
Heidelberg⁴⁶.
4.4.1 Data Import
Before starting the actual process of calculating incline information, the data sources need to be
imported into a PostgreSQL/PostGIS database running on a local machine. This allows the file-based
GPS data and street network to be stored in a relational way and provides fast and easy access
from the Java program. The following subsections describe the process of importing the GPS data and
the OpenStreetMap dump, from which the street network and land use information are extracted.
4.4.1.1 GPS traces
As mentioned in section 4.3.1, the GPX-planet file with the latest version from April 2013 is used
as the data source and extended with the traces uploaded between April 2013 and now, taken from the
public trace list. The import is implemented in Java and the source code is published on the GitHub
account of the GIScience Research Group⁴⁷. The GPX-planet file is packed and compressed as a
*.tar.xz archive and contains an XML file with metadata as well as all GPX files in a directory
structure. The metadata file includes information about the traces, such as user, user id, number
of points, description and tags. The tar.xz archive does not need to be unpacked in advance, since
this is done on-the-fly using a combination of different Java classes, which handle the unpacking
and reading automatically. The first file to be read and parsed is the metadata file. All entries
are stored as single objects in a TreeMap, which maps each GPX-id to its metadata object. A TreeMap
keeps its keys sorted, so that the metadata of a trace can be looked up by its ID in logarithmic
time. Once the metadata is stored in memory, the GPX files can be read sequentially. After reading
a file, it needs to be parsed into Java objects, which is known as unmarshalling⁴⁸. The
corresponding object classes need to be generated in advance from the XML schema of GPX version
1.0⁴⁹.
Once the GPX file is read, it needs to be filtered. The workflow of the filter, which checks two
conditions, is depicted in Figure 17. First, it is checked whether the trace contains elevation
information. For this step it is only necessary to check a single track point, since either all or
none of the track points have elevation information. If a track point does not contain information
regarding the elevation, the trace is skipped and not imported. Otherwise, it is checked whether
the trace falls into the bounding box of the pilot region. Here, the condition is met when at least
one track point is within the region. Only if no track point lies within the bounding box is the
trace not imported. A trace which meets both conditions is written, together with its metadata,
into a PostgreSQL/PostGIS database. As mentioned in section 4.3.1, a GPX file may contain several
track elements. Hence, each track of a GPX file is stored as a single 3D LineString, together with
the GPX-id, track-id and metadata. The track-id is newly introduced and unique within the tracks of
one GPX file. Figure 18 shows the columns of the relation ‘gpx_data_line’ and their datatypes. The
columns ‘gpx_id’ and ‘trk_id’ form the primary key. After the trace has been written to the
database, the ID of the trace is put into an ArrayList. This list is used later to identify whether
a trace has already been imported.
46 https://github.com/GIScience, checked on 20/07/2015
47 https://github.com/GIScience/osmgpxfilter, checked on 20/07/2015
48 http://en.wikipedia.org/wiki/Marshalling_%28computer_science%29, checked on 20/07/2015
49 http://www.topografix.com/gpx/1/0/gpx.xsd, checked on 20/07/2015
Figure 17: Filtering and import of GPS traces.
Figure 18: The schema of the relation 'gpx_data_line' for storing the GPS traces.
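The two filter conditions of Figure 17 can be sketched in Java as follows. This is a simplified
illustration and not the code of the published tool: the class TraceFilter, the representation of
a track point as an array {lat, lon, ele} and the use of NaN for a missing elevation are
assumptions made for this example.

```java
/** Illustrative version of the import filter: elevation check, then bounding-box check. */
final class TraceFilter {

    /** Bounding box in WGS84 degrees. */
    static final class BoundingBox {
        final double top;
        final double left;
        final double bottom;
        final double right;

        BoundingBox(double top, double left, double bottom, double right) {
            this.top = top;
            this.left = left;
            this.bottom = bottom;
            this.right = right;
        }

        boolean contains(double lat, double lon) {
            return lat <= top && lat >= bottom && lon >= left && lon <= right;
        }
    }

    /**
     * Returns true if the trace should be imported.
     * Each track point is {lat, lon, ele}; ele is NaN if the point has no elevation.
     */
    static boolean accept(double[][] trackPoints, BoundingBox bbox) {
        // Condition 1: checking a single point suffices, since either all or none
        // of the track points of a trace carry elevation information.
        if (trackPoints.length == 0 || Double.isNaN(trackPoints[0][2])) {
            return false;
        }
        // Condition 2: at least one track point must lie within the bounding box.
        for (double[] p : trackPoints) {
            if (bbox.contains(p[0], p[1])) {
                return true;
            }
        }
        return false;
    }
}
```

The elevation check is performed first because it only inspects one point, whereas the
bounding-box check may have to visit every point of the trace.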
At this point, all files included in the GPX-planet file have been imported. To add the traces
which have been uploaded to OSM after April 2013, the public trace list is scraped for the
additional traces. Scraping⁵⁰ is the automatic reading of information from web sites. The public
trace list is split into pages of 20 traces and each page can be requested with its page number
(e.g. https://www.openstreetmap.org/traces/page/1). The HTML code of the trace list contains the
links to the actual trace pages, each of which in turn contains the GPX-id and a link to the GPX
file. If the GPX-id is not contained in the ArrayList created during the import of the GPX-planet
file, the file is downloaded and unmarshalled. In contrast to the GPX-planet file, the files are in
the original version which was uploaded by the user. Therefore, the traces may be stored in
different GPX versions (1.0 or 1.1). This is problematic during the unmarshalling process, because
the version must be known. Consequently, the GPX version is detected first and the file is then
unmarshalled using the correct GPX schema. Since all following steps use the object classes
generated from the GPX version 1.0 schema, all traces of version 1.1 need to be transformed. After
the file has been unmarshalled successfully, it is checked for elevation information and against
the bounding box in the same way as before. After that, the tracks of the file are written into
the database.
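One possible way to detect the GPX version before unmarshalling is to read the ‘version’ attribute
of the root element, which is defined by both the GPX 1.0 and the GPX 1.1 schema. The following
Java sketch uses the streaming XML parser (StAX) of the Java standard library, so that only the
beginning of the file has to be read. How the published tool detects the version is not described
here; the class GpxVersionDetector is a hypothetical helper for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Reads the GPX version from the root element without parsing the whole document. */
final class GpxVersionDetector {

    /** Returns the value of the 'version' attribute of the root element, e.g. "1.0" or "1.1". */
    static String detect(String gpxXml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(gpxXml));
            try {
                while (reader.hasNext()) {
                    // The first START_ELEMENT event is the root element.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        if (!"gpx".equals(reader.getLocalName())) {
                            throw new IllegalArgumentException("Root element is not 'gpx'");
                        }
                        String version = reader.getAttributeValue(null, "version");
                        if (version == null) {
                            throw new IllegalArgumentException("Missing 'version' attribute");
                        }
                        return version;
                    }
                }
                throw new IllegalArgumentException("No root element found");
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Document is not well-formed XML", e);
        }
    }
}
```

Depending on the returned value, the file can then be unmarshalled with the object classes of the
matching schema version.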
As already mentioned, the developed tool is published. In order to make this tool usable for other
use-cases as well, all conditions which are checked during the import can be adapted. The arguments
must be given when starting the program from the command line. In addition to the export to
PostgreSQL, the tool also supports the output as an ESRI shapefile or as a database dump in the
same format as the input data. It can also be decided whether only the dump should be used as
input, or whether the public trace list shall be scraped in addition. The following command has
been used to run the program. It specifies the path to the input file and defines the desired
bounding box. The parameter ‘datasource’ with the value ‘both’ means that the dump is imported and
the OSM trace list is requested. The options ‘clip’ and ‘elevation’ ensure that GPS points outside
the bounding box are not written and that only points with elevation information are imported into
the database. At the end, the database parameters are given as well as the geometry format.
java -jar osmgpxfilter-0.1.jar \
  -bbox top=49.459693 left=8.573179 bottom=49.352565 right=8.794050 \
  --clip \
  --elevation \
  --datasource both \
  --input C:/baden-wuerttemberg.tar.xz \
  --write-pgsql db=osmgpx user=postgres password=XX host=localhost port=5432 \
  geometry=linestring
50 http://en.wikipedia.org/wiki/Web_scraping, checked on 20/07/2015
4.4.1.2 OSM Street Network and Land Use Information
The street network and the land use information are extracted from the OpenStreetMap planet file.
The entire planet file contains all map data of the world and thus has a size of around 30 GB.
Since only the data of the pilot region is necessary for this work, the regional extract for the
German state of Baden-Württemberg was downloaded⁵¹. The import into the database was carried out
using the command-line Java program Osmosis⁵². The tool reads the planet file or its regional
extracts sequentially and writes the data into relations corresponding to the OpenStreetMap data
model. It also provides filter capabilities based on tags and a bounding box. Consequently, only
ways containing either the tag ‘highway’ or the tag ‘landuse’ within the bounding box of the pilot
region were imported. Relations and nodes were rejected in order to keep the required space in the
database as small as possible and the time needed for the import as short as possible. The command
to read, filter and write the dump to the local database looks as follows:
osmosis \
--read-pbf C:\baden-wuerttemberg-latest.osm.pbf \
--bounding-box top=49.5117 left=8.52791 bottom=49.311 right=8.83534 \
--tag-filter accept-ways highway=* \
--tag-filter accept-ways landuse=* \
--tag-filter reject-relations \
--tag-filter reject-nodes \
--write-pgsimp host="localhost" database="osm" user="steffen" password="xx"
Prior to the import, the database schema with the necessary relations must be created. SQL scripts
containing the required CREATE statements are provided with the program files of Osmosis.
4.4.2 Preprocessing
This section describes the steps which are carried out in order to prepare the data for the
calculation of incline. Preprocessing steps are applied to both the GPS traces and the street
network.
4.4.2.1 GPS data
As described in section 4.3.1.4, the GPS data contains noise and other errors which may degrade the
quality of the incline values. As a consequence, the traces are preprocessed to lower the impact of
such irregularities. The preprocessing is implemented in Java and has two steps. Firstly, traces
are split where the distance between two adjacent points or the change in elevation exceeds a
certain threshold. Secondly, the split traces are smoothed in order to reduce the noise in the
elevation measurements. Figure 19 shows the workflow of the entire preprocessing.
51 http://download.geofabrik.de/europe/germany.html, checked on 20/07/2015
52 http://wiki.openstreetmap.org/wiki/Osmosis, checked on 20/07/2015
Figure 19: Flowchart of preprocessing the GPS traces
First, the ordered list of track points which compose the trace is traversed in order to detect the
points at which the trace needs to be split. The distance d and the change in height Δh are
calculated between each point and its successor. If d or Δh exceeds the given threshold, the trace
is split into two parts at the detected break point. The first part, from the start of the trace to
the break point, is stored in a list. The second part may still contain errors; therefore, the
splitting process starts again with the second part. This continues until no part contains a
distance or a change in elevation greater than the defined threshold. Since one trace, previously
uniquely identified by ‘gpx_id’ and ‘trk_id’, is split into several parts, a new ID, the ‘part_id’,
is introduced. The threshold for the maximum distance is set to 300 m, as in Zhang et al. (2010),
and 10 m seems reasonable for the maximum Δh. As found in section 5.1.1, the majority of height
changes in a flat area are below this value; a change of more than 10 m can therefore be considered
an error. These values are experimental, and changing them may lead to further improvements.
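The splitting step can be sketched in Java as follows. A single pass over the points yields the
same parts as repeatedly restarting the process with the remaining part of the trace. The sketch
makes several assumptions which are not taken from the implementation of this thesis: the class
name TraceSplitter, the representation of a track point as an array {lat, lon, ele} and the
haversine formula on a sphere for the horizontal distance.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a trace wherever two adjacent points are too far apart or differ too much in height. */
final class TraceSplitter {

    static final double MAX_DISTANCE_M = 300.0;      // threshold for the distance d
    static final double MAX_HEIGHT_CHANGE_M = 10.0;  // threshold for the height change
    static final double EARTH_RADIUS_M = 6371000.0;  // assumed mean earth radius

    /** Great-circle distance in meters (haversine formula; an assumption of this sketch). */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /**
     * Splits the ordered track points {lat, lon, ele} into parts.
     * The index of a part in the returned list can serve as its part_id.
     */
    static List<List<double[]>> split(List<double[]> points) {
        List<List<double[]>> parts = new ArrayList<>();
        List<double[]> current = new ArrayList<>();
        for (double[] p : points) {
            if (!current.isEmpty()) {
                double[] prev = current.get(current.size() - 1);
                double d = distanceMeters(prev[0], prev[1], p[0], p[1]);
                double dh = Math.abs(p[2] - prev[2]);
                if (d > MAX_DISTANCE_M || dh > MAX_HEIGHT_CHANGE_M) {
                    // Break point found: close the current part and start a new one.
                    parts.add(current);
                    current = new ArrayList<>();
                }
            }
            current.add(p);
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }
}
```

Note that this sketch may produce parts consisting of a single point, which would have to be
discarded before storing the parts as LineStrings.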
Once, a trace is split in parts, each part will be smoothed. Methods for smoothing time series were
reviewed in section 3.5. A linear smoother is preferred over a non-linear one, since it smooths also
abrupt changes in the elevation, which usually cannot be expected on streets. Consequently, a
weighted moving average algorithm has been implemented. The weights must be defined in
4 Methodology
45
advance and the number of weights must be odd. Note, that also the sum, of all weights must be
equal to one. Considering these condition, the weights can be individually defined by the user.
They are not depending on the distance between, the data points, since a GPS trace is usually
recorded using a certain distance and time interval. Thus, the time series is considered as equidis-
tant. The following set of weights W has been used:
𝑊 = {0.1, 0.125, 0.15, 0.25, 0.15, 0.125, 0.1}
Since the number of weights is seven, each data point will be smoothed, considering the two
preceding and the two following data points. The further away a point is, the less is its impact on
the new smoothed data point. The smoothed value of a point is calculated as follows:
\[
h_n^{*} = h_{n-3}\,w_1 + h_{n-2}\,w_2 + h_{n-1}\,w_3 + h_n\,w_4 + h_{n+1}\,w_5 + h_{n+2}\,w_6 + h_{n+3}\,w_7 ,
\]
where
$h_n^{*}$ = smoothed elevation
$h_{n-i}$ = original elevation of data points
$w_i$ = weights
The drawback of this approach is that it cannot be applied directly to the three end values on each
side, since not all adjacent points exist there. In order not to dismiss them, they are smoothed
using only the adjacent values which exist. For example, the first data point is smoothed using the
first four and the second data point using the first five values in the series. The same applies to
the values at the end of the series. After smoothing, the trace parts are written as 3D LineStrings
into a new table in the database (Figure 20). The primary key consists of the three ID
columns ‘gpx_id’, ‘trk_id’ and ‘part_id’.
Figure 20: Columns of the relation, which stores the preprocessed GPS traces.
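The weighted moving average, including the treatment of the end values, can be sketched as follows. This is an illustrative Python sketch, not the thesis implementation. The text does not state how the weights are handled when part of the window is missing; the sketch assumes that the weights of the existing points are renormalized so that they again sum to one.

```python
WEIGHTS = [0.1, 0.125, 0.15, 0.25, 0.15, 0.125, 0.1]


def smooth(elevations, weights=WEIGHTS):
    """Weighted moving average over an equidistant elevation series."""
    if len(weights) % 2 == 0:
        raise ValueError("number of weights must be odd")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    half = len(weights) // 2
    n = len(elevations)
    smoothed = []
    for i in range(n):
        total, weight_sum = 0.0, 0.0
        for k, w in enumerate(weights):
            j = i + k - half  # index of the neighbour covered by weight k
            if 0 <= j < n:    # at the ends, use only neighbours that exist
                total += elevations[j] * w
                weight_sum += w
        # assumption: renormalize by the weights actually used
        smoothed.append(total / weight_sum)
    return smoothed
```

With the weights given above, an isolated spike of 10 m in an otherwise flat series is reduced to 2.5 m at the spike itself, and a constant series is left unchanged.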
4.4.2.2 Street Network
The incline is calculated for each street segment of the OpenStreetMap street network within
the pilot region. The length of a street segment is often determined by semantic properties
rather than by geometry. For example, a street segment may be as long as the part of a street which
carries one name. Such segments may be quite long and span several valleys and peaks. This
is a disadvantage for the calculation of incline, since one average incline value is calculated for
each segment. If a segment contains, for example, one ascent and one descent of the same
magnitude, the calculated incline would be 0. To overcome this issue, the
street segments are split into smaller parts. The parts should not be too short either, since this
decreases the number of GPS points which can be used for the calculation. Therefore, the streets
are split at the intersection points with other street segments. This was done in QGIS, using the
function ‘Split lines with lines’ of the processing toolbox. Due to the splitting, the OSM ID can no
longer be used as a unique identifier; therefore, a new ID is introduced. The resulting relation in the
database is shown in Figure 21.
Figure 21: Schema of relation 'streets'.
After this first step, the split street segments were enriched with land use information, using
the QGIS function ‘Join attributes by location’. In OpenStreetMap it may
happen that the land use polygon does not cover the street. In order to assign the land use
which is next to the street, a buffer of 20 m was applied to the polygons in advance. Figure 22
shows the street network and land use polygons which do not cover the street. The dashed line
shows the buffered land use polygon. The land use information is stored as a new tag with the key
‘landuse_incline’.
Figure 22: Enhancement of the street network with land use information in cases where the land use
polygon does not cover the street segment.
4.4.3 Map Matching
As stated in section 3.4, map matching is an essential step in mining street-attribute information
from GPS traces. It is important to know to which street the derived information refers. The
map matching approach for this research is implemented in Java and based on the algorithm of
Zhang et al. (2010). Figure 23 shows the workflow of the implemented algorithm.
Figure 23: Flowchart of map matching process.
First, the street segments are requested from the database. They are processed sequentially,
and for each segment the corresponding GPS traces are determined. To find candidate traces
in the relation ‘gpx_data_line’, the geometry of the street segment is buffered with a buffer size
of 50 m. All GPS traces which intersect the buffer of the street geometry are selected as candidate
traces. It is therefore important that the buffer size is not too small, since all
possible traces should be selected. Figure 24a depicts the street segment and its buffer in dark and
light green, and the GPS traces which intersect the buffer in orange.
Next, GPS traces which were not recorded on the selected street segment must be dismissed. This
is often done by analyzing the distance and the direction between the street segment and the GPS
trace. Here, no direction or distance is calculated. Instead, for each node of the street segment a
line is created which passes through the node and is perpendicular to the current edge of the street
segment. For this step the Java class GeodeticCalculator of the library GeoTools was used. It allows
the calculation of new coordinates from a given reference point, a distance and a direction. The
length is chosen to be 30 m as in Zhang et al. (2010), considering the horizontal error of GPS
measurements and the width of a street. Figure 24b shows the calculated profile lines in dark blue.
Once the profile lines are created, it is checked for each candidate trace how many of
the profile lines it intersects. If a trace intersects most of the profile lines, it follows
the direction of the street segment and is generally not further away from the centerline of the street
than 30 m. The minimum percentage (threshold) of intersected profile lines is set to 70 % for this
research. Lowering the threshold to 50 % would give more matches but also more
false positives (incorrectly matched traces), especially for street segments with only two nodes.
In this case, traces which intersect only one profile line would also be matched,
and it could no longer be assumed that the trace follows the direction of the street. On the other
hand, a threshold of 100 % would be too restrictive, especially for
street segments with many nodes: it may always happen that the trace exceeds the distance of 30 m
because of the GPS inaccuracy or the geometric error of the street segment. Therefore, the
threshold has been set to 70 %. Figure 24c shows in red the GPS traces which intersect at least 3
out of 4 profile lines.
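The profile-line test can be illustrated with the following Python sketch. It is not the Java/GeoTools implementation of the thesis. For simplicity it assumes planar coordinates in meters (the thesis works with geographic coordinates and GeodeticCalculator), it assumes that the profile line extends 30 m to each side of the node, and it treats a ratio equal to the threshold as a match. All function names are hypothetical.

```python
import math


def profile_lines(street, half_length=30.0):
    """One line per street node, perpendicular to the adjacent edge."""
    lines = []
    for i, (x, y) in enumerate(street):
        # use the outgoing edge; the last node reuses the incoming edge
        if i < len(street) - 1:
            (ax, ay), (bx, by) = street[i], street[i + 1]
        else:
            (ax, ay), (bx, by) = street[i - 1], street[i]
        length = math.hypot(bx - ax, by - ay)
        if length == 0:
            continue
        nx, ny = -(by - ay) / length, (bx - ax) / length  # unit normal
        lines.append(((x - nx * half_length, y - ny * half_length),
                      (x + nx * half_length, y + ny * half_length)))
    return lines


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _on_segment(p, q, r):
    """True if r lies in the bounding box of pq (used for collinear points)."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(p1, p2, p3, p4):
    """True if the segments p1p2 and p3p4 cross or touch."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    if ((d1 > 0 > d2) or (d1 < 0 < d2)) and ((d3 > 0 > d4) or (d3 < 0 < d4)):
        return True
    return ((d1 == 0 and _on_segment(p3, p4, p1))
            or (d2 == 0 and _on_segment(p3, p4, p2))
            or (d3 == 0 and _on_segment(p1, p2, p3))
            or (d4 == 0 and _on_segment(p1, p2, p4)))


def trace_matches(street, trace, ratio=0.7, half_length=30.0):
    """True if the trace intersects at least `ratio` of the profile lines."""
    lines = profile_lines(street, half_length)
    if not lines:
        return False
    hits = sum(
        any(segments_intersect(a, b, trace[k], trace[k + 1])
            for k in range(len(trace) - 1))
        for a, b in lines
    )
    return hits / len(lines) >= ratio
```

For a straight street with four nodes, a trace which crosses three of the four profile lines reaches 75 % and is matched, whereas a trace crossing only two of them (50 %) is dismissed.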
Figure 24: The map matching process: Select candidate traces with buffer (light green) of street (dark green) (a),
create profile lines (blue) (b), select traces (red) which intersect at least 70 % of the profile lines (c).
If the percentage of intersected profile lines reaches the threshold, the GPS trace is considered
as having been recorded on the street segment. The approach by Zhang et al. (2010) includes further
processing in case of parallel streets. If two street segments are parallel to each other, it is first
checked whether one of them is a one-way street (e.g. a highway where each direction is
mapped individually). Then the direction of the traces is calculated, and a trace is only matched
with the street segment if the direction of the street is similar. For parallel streets which are not
one-way streets, a clustering algorithm is used to identify corresponding traces. This is not
necessary for this research, since for the calculation of the incline it does not matter on
which of the parallel streets the trace was recorded. It is assumed that parallel streets
have the same incline. Consequently, there is an ‘n to m’ relationship between street segments
and GPS traces: one street may have multiple GPS traces and one GPS trace may be matched
to multiple street segments. With this assumption, more traces can be matched to a street
segment if there are two parallel geometries. Examples are streets represented with one geometry
per direction, or footways and bicycle lanes alongside a street which are mapped as individual
LineStrings. However, this assumption may also lead to errors when two parallel streets do not have
the same incline, such as a driveway leading onto a bridge as shown in Figure 25.
Figure 25: Example of two parallel streets which do not have the same incline.
After the traces of a street segment have been found, the IDs referencing the street segment and the
GPS trace are written to the database. The relation called ‘streets_gpx’ references the relations
‘streets’ and ‘gpx_data_line’ via foreign keys. This is shown in the UML diagram in Figure 26.
Figure 26: The tables 'gpx_data_line', 'streets_gpx' and 'streets' and their relation to each other.
The tool is published and thus available to anyone who may need it. The program is kept as
generic as possible so that it also works with other database schemas. The required database relations
and columns as well as the input parameters mentioned above are specified in a properties file
(Figure 27).
Figure 27: Properties file of map matching tool
#Properties file for map matching

####### street network #######
# name of street table in database
t_streetName=streets
# column with unique id for each street segment
t_streetIdCol=id
# column with osm_id (must not be unique, in case street segment were split in preprocessing)
t_streetOsmIdCol=osm_id
# column with osm tags, stored in hstore
t_streetTags=tags
# column with geometry. geometry type must be LineString and CRS must be WGS 84
t_streetGeomCol=the_geom

####### gpx input data #######
# name of gpx table in database
t_gpxrawName=gpx_data_line
# unique id for each GPS trace
t_gpxrawIdCol=gpx_id
# unique id for each GPS trace
t_trkrawIdCol=trk_id
# column with geometry. Geometry type must be MultiLineString and CRS must be WGS 84
t_gpxrawGeomCol=geom

# default name for output table
dbMatchingOutputTable=streets_gpx
#buffer in meters (should be equal or bigger than streetProfileLength)
streetBuffer=60
# length of profile lines which are fitted through the nodes of the street segments [m]
streetProfileLength=30
#ratio of profile line of street which need to be intersected by GPS-trace in order to assume a match
streetProfileIntersectionRatio=0.7
4.4.4 Calculation of Incline
After the prior steps, the incline of the street segments can be calculated. As in the map
matching process, the street segments are processed sequentially. The workflow is
depicted in Figure 28.
Figure 28: Workflow for calculating the incline of street segments.
The first step is to request the street segments from the database. After that, the preprocessed
GPS traces (cf. section 4.4.2.1) must be requested from the database. They can easily be requested
through the relation ‘streets_gpx’ and a join with the relation ‘gpx_data_line_preprocessed’.
Usually, a GPS trace spans more than one street segment. Thus, the traces need to be cut
so that only those track points are selected which are relevant for the calculation of the incline.
Relevant are those track points which are near the street segment and not before or beyond it. To
achieve this, a buffer with a size of approximately 30 m is calculated around the street segment.
With this buffer, the corresponding GPS traces can be clipped. Consequently, only the part of each
GPS trace which lies within the buffer polygon of the street segment is used.
Figure 29: Clipping of assigned GPS traces.
The clipping is performed in the database, since the Java library GeoTools does not
support clipping of 3D geometries. For the selection of the GPS traces and the clipping, the following
SQL statement is executed from the Java tool:
-- 'buffered_street_geom' and 'current_street_id' are placeholders for the values
-- of the street segment which is currently being processed
SELECT g.gpx_id,
       g.trk_id,
       g.part_id,
       ST_ASGEOJSON(
           ST_INTERSECTION(g.geom, ST_GeomFromEWKB('buffered_street_geom'))
       ) AS geom
FROM streets_gpx sg
LEFT JOIN gpx_data_line_preprocessed g ON
    sg.gpx_id = g.gpx_id AND
    sg.trk_id = g.trk_id
WHERE
    sg.street_id = 'current_street_id'
Besides the IDs which uniquely identify the GPS trace, this SQL statement returns the clipped
geometry. It has to be noted that the PostGIS function ST_INTERSECTION returns the
clipped geometries of the GPS trace parts as MultiLineStrings. This has to be considered later,
when the incline is calculated.
Once all corresponding traces with clipped geometry have been requested from the database, the
incline in percent is calculated for each trace. Since a clipped GPS trace is a MultiLineString, it
needs to be split into LineString objects. The incline, here denoted by m, is the weighted average of
the inclines of all edges of the LineString. The weight is the horizontal length of an edge,
normalized by the full horizontal length of the LineString. Since the elevation is given in meters
and the GPS coordinates are geographic coordinates, the distance in meters is calculated
considering the earth as a sphere. The calculation of the incline can be expressed using the following
formula:
\[
m_t = \sum_{i=1}^{n-1} \left( \frac{h_i - h_{i+1}}{d_{i,i+1}} \right) \left( \frac{d_{i,i+1}}{l} \right) ,
\]
where
$m_t$ = incline of GPS trace segment in percent
$n$ = number of track points
$h$ = elevation of track point
$d$ = horizontal distance between two track points
$l$ = horizontal length of GPS trace segment
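The formula can be turned into a short Python sketch. It is an illustration only, not the Java code of the thesis. The sign follows the formula as printed (elevation of a point minus elevation of the next point), so a trace which gains elevation yields a negative value. The factor 100 for the conversion to percent does not appear in the formula and is an assumption, as are the earth radius, the function names and the point layout (lon, lat, h).

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two geographic positions."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def trace_incline_percent(points):
    """Length-weighted mean of the edge inclines of one LineString.

    points: list of (lon, lat, h). Returns None if the trace has no
    horizontal extent.
    """
    edges = list(zip(points, points[1:]))
    dists = [haversine_m(a[0], a[1], b[0], b[1]) for a, b in edges]
    total = sum(dists)
    if total == 0:
        return None
    m = 0.0
    for (a, b), d in zip(edges, dists):
        if d == 0:
            continue  # duplicate position, contributes no horizontal length
        edge_incline = (a[2] - b[2]) / d  # sign as printed: h_i - h_(i+1)
        m += edge_incline * (d / total)   # weight: share of the total length
    return 100.0 * m  # assumption: factor 100 converts the ratio to percent
```

Because each edge is weighted by its own length, the sum reduces to the total elevation difference divided by the total horizontal length of the trace.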
In OSM, street segments are directed from the first node of the segment to the last one.
Consequently, it has to be checked whether the GPS trace was recorded in the same direction as the
street or in the opposite direction. This is done by comparing the average bearing of the street
segment with that of the GPS trace. If the difference in bearing is greater than a certain threshold,
it is assumed that the GPS trace follows the opposite direction and the sign of the calculated incline
value is inverted. Individual samples have shown that, given the geometric accuracy of both the GPS
trace and the street segment, a threshold of 40° is reasonable.
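The direction check can be sketched as follows. This Python sketch is illustrative; the text does not specify how the average bearing is computed, so the bearing from the first to the last vertex is used here as a stand-in, which is an assumption. The function names are hypothetical.

```python
import math


def bearing_deg(lon1, lat1, lon2, lat2):
    """Initial great-circle bearing from point 1 to point 2 in [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_difference(b1, b2):
    """Smallest angle between two bearings, in [0, 180]."""
    d = abs(b1 - b2) % 360.0
    return 360.0 - d if d > 180.0 else d


def oriented_incline(incline, street, trace, threshold_deg=40.0):
    """Invert the incline if the trace runs against the street direction.

    street and trace are lists of points starting with (lon, lat).
    Assumption: the 'average bearing' is approximated by the bearing
    from the first to the last vertex.
    """
    sb = bearing_deg(street[0][0], street[0][1], street[-1][0], street[-1][1])
    tb = bearing_deg(trace[0][0], trace[0][1], trace[-1][0], trace[-1][1])
    return -incline if angular_difference(sb, tb) > threshold_deg else incline
```

The difference is taken on the circle, so bearings of 350° and 10° differ by 20° and do not trigger an inversion.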
The next step is to combine the incline values of the individual traces into one value which
represents the incline of the street segment. Since the length of the traces may vary, a
weighted average based on the length of the traces was chosen here as well. The following formula
has been used for averaging the incline values of the individual traces:
\[
m_s = \sum_{a=0}^{k} \left( m_{t,a} \right) \left( \frac{l_a}{\sum_{a=0}^{k} l_a} \right) ,
\]
where
$m_s$ = incline of street segment in percent
$m_{t,a}$ = incline of GPS trace $a$ in percent
$k$ = number of corresponding traces
$l_a$ = horizontal length of trace $a$
After the calculation, the result is written into a new relation called ‘streets_incline’. Besides the
street ID and the calculated incline, the number of traces which have been used for the calculation
is added to the table. This supports the estimation of the accuracy, which is described in section 0.
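The combination of the trace inclines into one street value can be sketched in a few lines of Python. The function name and the input layout are assumptions made for this example; the sketch also returns the number of traces, as it is stored alongside the incline.

```python
def street_incline(trace_results):
    """Length-weighted mean incline of one street segment.

    trace_results: list of (incline_percent, horizontal_length_m), one
    entry per matched trace. Returns (incline, number_of_traces); the
    incline is None if no trace with a positive length is available.
    """
    total_length = sum(length for _, length in trace_results)
    if total_length == 0:
        return None, len(trace_results)
    m_s = sum(m * (length / total_length) for m, length in trace_results)
    return m_s, len(trace_results)
```

For example, a 100 m trace with 2 % and a 300 m trace with 4 % give a street incline of 3.5 %, since the longer trace receives three quarters of the weight.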
4.5 Validation
In chapter 5, the results will be validated to estimate the achieved accuracy. This will be done by a
comparison with incline values which have been derived using a high-accuracy DTM. Reference
incline values could also be calculated from terrestrial measurements, but acquiring such test data
for a reasonable number of streets would be very costly and time-consuming. Therefore, it was decided
to use incline values calculated from a high-accuracy DTM acquired from LiDAR measurements
(cf. section 4.3.4). To avoid confusion, these incline values will in the following be referred to as
DTM incline, whereas the incline calculated from GPS traces will be referred to as GPS incline. With
the DTM, it is possible to calculate an incline value with reasonable accuracy for all
street segments. Moreover, the results of the thesis will be compared to incline values calculated
from the SRTM-1 DSM (SRTM incline). This will show how crowdsourced GPS traces perform in
comparison to other globally and freely available data.
The first step in calculating the incline from a DEM which is available in raster format is to densify
the street geometry in order to increase the number of nodes per street segment. The street segments
receive additional nodes at a spacing of at most three meters. For each node of the densified street
geometry, the absolute elevation is taken from the DEM. The Java library GeoTools provides
classes and functions to easily load georeferenced raster files and to return the elevation for a
specific location from it. Once each node has an elevation value, the incline of the street segment
can be calculated. This is done by averaging the incline values of all edges of the street segment.
In contrast to the calculation of the GPS incline, an unweighted average is calculated, since the
nodes of the street segment are equidistant. The DTM incline and the SRTM incline are then added
to the relation ‘streets_incline’, which was already created during the calculation of the
GPS incline.
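The densification and the unweighted averaging can be sketched as follows. This Python sketch is illustrative and makes several assumptions: coordinates are planar and in meters, the raster lookup is replaced by a hypothetical callable `elevation_at(x, y)`, and the sign convention of the GPS incline formula (elevation of a node minus elevation of the next node) is reused so that both values are comparable.

```python
import math


def densify(line, max_spacing=3.0):
    """Insert equally spaced nodes so that no edge exceeds max_spacing."""
    out = [line[0]]
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        length = math.hypot(x2 - x1, y2 - y1)
        steps = max(1, math.ceil(length / max_spacing))
        for s in range(1, steps + 1):
            t = s / steps
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out


def dem_incline_percent(line, elevation_at, max_spacing=3.0):
    """Unweighted mean of the edge inclines of the densified line.

    elevation_at is a stand-in for the raster lookup and is an assumption
    of this sketch.
    """
    nodes = densify(line, max_spacing)
    inclines = []
    for (x1, y1), (x2, y2) in zip(nodes, nodes[1:]):
        d = math.hypot(x2 - x1, y2 - y1)
        if d > 0:
            # same sign convention as the GPS incline formula
            inclines.append((elevation_at(x1, y1) - elevation_at(x2, y2)) / d * 100.0)
    return sum(inclines) / len(inclines) if inclines else None
```

Note that the inserted nodes are equidistant within one original edge; the spacing may differ slightly between edges of different length.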
For the calculation of the SRTM incline, the elevation of each node of the street segment is looked
up in the SRTM-1 DSM. The horizontal resolution of SRTM-1 is approximately 30 m. With a street
node every 3 m, about 10 consecutive nodes would receive the same elevation value from the
DSM, which would affect the quality of the SRTM incline. To avoid this, the SRTM-1 DSM was
resampled to a horizontal resolution of 1 m by 1 m, using the ArcGIS function ‘Resample’. This
results in interpolated elevation values for each 1 m by 1 m pixel.
5 Discussion of Results
The implemented methods from the previous chapter were applied to the GPS and street network
data of the pilot region. The outcome will be discussed in the following sections. First of all, the
accuracy of the GPS raw data will be assessed. After that, the calculated GPS incline values will be
validated using a high-accuracy DTM and compared to the incline values derived from the
SRTM-1 DEM.
5.1 Analysis of Crowdsourced GPS traces
As already stated in section 0, a total of 3842 GPS traces with over two million track points
with elevation information are used for this research. In this section, these 3D points, visualized in
Figure 30, are analyzed with respect to vertical accuracy as well as coverage and density.
Figure 30: Screenshot of visualized GPS track points, colorized according to their elevation (green=low,
red=high)⁵³.
5.1.1 Vertical Absolute and Relative Accuracy
In order to judge the accuracy of the data source for the calculation of incline values, the
measured GPS elevation is compared to the elevation from the LiDAR DTM. First, the absolute
error of the GPS track points is calculated; second, the relative accuracy within the GPS
traces is evaluated. The relative accuracy refers to the difference in elevation between two
adjacent points. It has to be noted that the elevation from the LiDAR DTM may also suffer
from inaccuracy. The DTM reflects the terrain of the earth; consequently, structures on the
earth’s surface, such as buildings, trees or bridges, are not contained. The areas where no ground
elevation could be measured are interpolated. This leads to an error when transferring the DTM
elevation to a GPS track point which is located on a bridge. Furthermore, the horizontal inaccuracy
of the GPS points influences the elevation values taken from the DTM. If a GPS trace is
recorded on a street directly next to a very steep slope, the GPS track point may fall up to 10 m
beside the street, depending on the horizontal error. As a consequence, a wrong elevation value is
transferred from the DTM.
53 Screenshot taken from: http://cap4route.geog.uni-heidelberg.de/hd-osm-gps-webgl/hd-osm-gps-webgl.html,
checked on 20/07/2015
5.1.1.1 Absolute Accuracy
For the assessment of the absolute accuracy, the root-mean-square error (RMSE) has been
calculated overall and for each land use class. The RMSE (the square root of the average of the
squared residuals) is commonly used in spatial analysis and provides a measure of the differences
between GPS and DTM elevations. The residual is the difference between the GPS measurement and the
reference value (DTM). The diagram in Figure 31 shows the RMSE in meters, overall and
differentiated by land use class. For the calculation of the RMSE, only 90 % of the residuals have
been used, while the other 10 % have been excluded as outliers. The outliers may come from traces
which were not recorded on the earth’s surface (e.g. from an airplane) or from wrongly calibrated
devices with a barometric elevation measurement unit.
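An RMSE over 90 % of the residuals can be sketched as follows. The text does not state how the excluded 10 % are selected; this Python sketch assumes that the residuals with the largest absolute value are dropped. The function name and parameters are hypothetical.

```python
import math


def trimmed_rmse(gps_elev, dtm_elev, keep_share=0.9):
    """RMSE of GPS minus DTM elevation over the kept share of residuals."""
    residuals = [g - d for g, d in zip(gps_elev, dtm_elev)]
    if not residuals:
        raise ValueError("no residuals given")
    # assumption: outliers are the residuals with the largest magnitude
    residuals.sort(key=abs)
    kept = residuals[: max(1, int(len(residuals) * keep_share))]
    return math.sqrt(sum(r * r for r in kept) / len(kept))
```

With ten residuals of which one is 1000 m, the single extreme value is excluded and no longer dominates the result.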
Figure 31: Vertical accuracy of crowdsourced GPS traces, distinguished by land use class.
[Bar chart, RMSE [m] by land use: allotments 35, commercial 34, grass 30, overall 27, residential 25,
forest 22, industrial 22, farmland 21]
Depending on the land use class, the error ranges from 21 m (farmland) to 35 m (allotments). The
overall RMSE lies approximately in the middle at 27 m. According to Liu et al. (2014), the
vertical error can be up to 2.5 times larger than the horizontal one. Considering a horizontal
accuracy of 6 to 10 m (cf. Zhang et al. 2010), the vertical accuracy can be assumed to be 15 m to
25 m. The RMSE of the data evaluated in this study is within this range only for some land use
classes; overall, the error is higher. Several reasons may lead to these errors. User-generated
traces are likely not recorded under very good conditions. One may store the GPS device in a
car, in a pants pocket or in a backpack while hiking. Under these conditions, an additional
error due to obstruction of the GPS signal may occur. Furthermore, the GPS data may contain traces
which have not been recorded on the earth’s surface. Traces have been found which were obviously
recorded from a flying object. An additional reason may be the diversity of devices.
Some elevations may originate from barometric measurements or from
elevation databases, as stated in section 4.3.1. While barometers may be wrongly calibrated and
consequently introduce a systematic error, elevation databases may be referenced to a geoid instead
of the WGS 84 ellipsoid, with the ellipsoidal height from the GPS transformed to a geoidal height.
This would introduce a systematic error as well and affect the measure of the absolute vertical
accuracy.
The differentiation by land use has been undertaken to evaluate whether the RMSE depends on the
land use. Different land use classes have different characteristics with regard to the obstruction of
the GPS signal. Especially in forested or residential areas a larger error would be expected, since a
dense tree canopy or high buildings obstruct the view from the GPS receiver to the satellites. In
addition, multipath effects may occur when the GPS signal is reflected by the façades or windows of
buildings. In contrast, areas of land use classes such as ‘farmland’, ‘allotments’ or ‘grass’ were
expected to have a smaller absolute error. In those areas one generally finds small houses, just
a few trees and wide streets. Consequently, there are few structures which potentially
influence the GPS signal, so a wide and unobstructed view of the sky and the satellites is
possible. As can be seen in the diagram in Figure 31, this assumption cannot be fully
confirmed for OSM GPS traces. The land use classes ‘allotments’ and ‘grass’, which
were expected to be less erroneous, showed some of the largest errors, whereas GPS track points in
the class ‘forest’ have one of the smallest errors. This is very surprising, but a reason may be the
sample size of the land use classes. ‘Forest’ is one of the land use classes with the most GPS track
points, whereas ‘allotments’ and ‘grass’ have fewer.
Figure 32 depicts a histogram of the differences between GPS and DTM elevations that have been
used to calculate the aforementioned RMSE. The vertical axis shows the total number of points
which fall into the bins represented on the horizontal axis. The width of the bins is two meters. It
can be seen that the data is mainly normally distributed around zero meters. Since the elevation of the
DTM is referenced to the quasigeoid (cf. section 4.3.4) and the GPS elevation is usually referenced
to the WGS 84 ellipsoid, an offset equivalent to the geoid undulation was expected for most of the
points. The geoid undulation between the WGS 84 ellipsoid and the Earth Gravitational Model 96
(EGM96) in Heidelberg is approximately 48 m⁵⁴. Because the main peak lies at zero rather than at
48 m, it can be assumed that most of the GPS elevations are referenced to a geoid. Most likely, the
elevation measurements are internally transformed by the software of the device. In the histogram, a
second peak at around 48 m can be found. This represents those GPS elevations which are referenced
to the WGS 84 ellipsoid.
54 Calculation done with the online geoid calculator at http://geographiclib.sourceforge.net/cgi-bin/GeoidEval,
checked on 20/07/2015
Figure 32: Histogram with the differences of GPS and DTM elevation
5.1.1.2 Relative Accuracy
With approximately 27 m, the absolute error appears to be very large, especially compared
to the vertical accuracy of the SRTM-1 DSM, which is 6.2 m (cf. section 4.3.4). Since only the
difference in elevation between two adjacent track points is used for the calculation of incline, the
relative accuracy is evaluated in this section. For each GPS track point, the change of elevation
Δh_GPS to the next track point of the trace has been calculated. This value reflects the actual
incline of the terrain plus an error caused by the GPS measurement. Since the absolute error is not
considered here, the occurring errors can be regarded as noise. To remove the influence of the
terrain, the high-accuracy DTM has been used to calculate the corresponding change in elevation
Δh_DTM. The difference between Δh_GPS and Δh_DTM indicates the actual GPS error e_Δh. The
following formula has been used:
\[
e_{\Delta h} = \Delta h_{GPS} - \Delta h_{DTM}
\]
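The relative error can be computed per pair of consecutive track points as in the following Python sketch. The function name and the input layout (two elevation lists of equal length for one trace part) are assumptions of the example.

```python
def relative_errors(gps_elev, dtm_elev):
    """Relative error for each pair of consecutive track points.

    gps_elev and dtm_elev hold the GPS and DTM elevation of the same
    track points, in recording order.
    """
    errors = []
    for i in range(len(gps_elev) - 1):
        dh_gps = gps_elev[i + 1] - gps_elev[i]  # change to the next point
        dh_dtm = dtm_elev[i + 1] - dtm_elev[i]  # terrain share of the change
        errors.append(dh_gps - dh_dtm)
    return errors
```

A constant offset between GPS and DTM elevations, such as a geoid undulation of 48 m, cancels out in the differences, which is why the relative error can be small even where the absolute error is large.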
For the assessment of the relative accuracy, the preprocessed and smoothed GPS traces have been
used. e_Δh has been calculated for approximately 1.7 million track points. The box-and-whisker plot
in Figure 33 shows the distribution of e_Δh overall and differentiated by land use class. The
bottom and top of the boxes represent the first and third quartile, whereas the ends of the whiskers
show the 5th and the 95th percentile. These percentiles were chosen because of a few but very large
outliers present in the data set.
[Figure 32, histogram: horizontal axis ‘Differences GPS and DTM Elevation [m]’ (−30 to 78, bin width 2 m),
vertical axis ‘Frequency [thousands]’ (0 to 200)]
Figure 33: Relative accuracy of crowdsourced GPS track points, overall and distinguished by land uses.
Overall, the error in the change of elevation lies within a range of ± 0.16 m for 50 % of the data. The
whiskers extend to approximately ± 1.00 m. In contrast to the absolute error, it is obvious here that
land use classes which do not suffer from obstructions by buildings and trees perform better than
others. For areas with mainly grass and farmland, the box covers approximately
± 0.12 m (grass) and ± 0.10 m (farmland). For 90 % of the data, e_Δh falls into an interval of ± 0.7 m
(grass) and ± 0.5 m (farmland). For land use classes which suffer more from obstructions, such as
‘commercial’, ‘industrial’, ‘residential’ and ‘allotments’, 50 % of the points are affected by an
error within a range of ± 0.13 m to ± 0.15 m. Although the first and third quartiles of the
aforementioned land uses are similar, the extents of the whiskers vary. While the whiskers of the
land use class ‘industrial’ are within a range of ± 0.6 m, the range increases to approximately
± 0.7 m for ‘commercial’ and ‘allotments’ and even to over ± 0.8 m for
‘residential’. The land use class with the largest extent of errors in this investigation is ‘forest’.
The error for 50 % of the data falls into the range of ± 0.26 m, which is more than twice as high as
for farmland. The 5th and 95th percentiles also show a large deviation of e_Δh, approximately
± 1.5 m. This reflects the obstructions which are present in forest areas due to the dense tree
canopy.
[Figure 33, box-and-whisker plot: vertical axis e_Δh [m] (−2 to 2), horizontal axis land use]
For the calculation of incline, not only the difference in elevation is important but also the distance
between the two adjacent points. Therefore, it cannot be judged from the error e_Δh examined above
how much it influences the incline value. To get an idea of this, all values of e_Δh were
aggregated by land use class and the RMSE was calculated with 90 % of the data. Furthermore, the
average distance between two adjacent points within the areas of the specific land use classes has
been used to calculate the effect of the error on the incline. Table 6 shows that the calculated
incline between two adjacent GPS track points contains an overall error of 2.4 %. The result
depends on the land use class. The smallest error occurs in areas of the land use classes ‘grass’ and
‘farmland’ with 1 % or less. In forested areas, an error of over 4 % can be expected. The
track points within the other areas are in a range of 2.2 % to 2.7 %.
Land Use       RMSE e_Δh [m]   Distance [m]   ≙ incline [%]
overall        0.3             14             2.4
allotments     0.2             11             2.2
commercial     0.2             10             2.4
farmland       0.2             17             1.0
forest         0.5             11             4.3
grass          0.2             29             0.8
industrial     0.2             10             2.4
residential    0.3             10             2.7
Table 6: The effect of the relative accuracy on the calculated incline.
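The relation behind the last column of Table 6 can be written as a worked example. The symbol $e_m$ for the incline error and $\bar{d}$ for the average distance are introduced here for illustration only, and the formula is an interpretation of the description in the text rather than a formula given by it.

```latex
\[
e_m \approx \frac{\mathrm{RMSE}(e_{\Delta h})}{\bar{d}} \cdot 100\,\%,
\qquad \text{e.g. overall: } \frac{0.3\,\mathrm{m}}{14\,\mathrm{m}} \cdot 100\,\% \approx 2.1\,\%
\]
```

The small deviation from the tabulated 2.4 % is presumably due to the rounding of the RMSE and distance values shown in the table.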
To summarize, the relative accuracy depends on the land use class and on its
characteristics regarding obstructions by buildings, trees or other structures. The error sources
which are mainly responsible for this result are shadowing and multipath (cf. section 2.1.2). Other
error sources such as ionospheric effects or clock inaccuracies do not affect the results as much.
Furthermore, it can be seen from the box-and-whisker plot in Figure 33
that the error occurs more or less equally in positive and negative direction. Therefore, it may be
assumed that, due to the large number of points which are used for the calculation of the incline of
one street segment, the noise will largely cancel out when the average is calculated.
5.1.2 Coverage and density
In this section the coverage and density of the OSM GPS traces are examined. This gives an
idea of for how many streets data actually exist and how dense the GPS track points are. The coverage
is investigated using the result of the map matching algorithm implemented for this research.
It has to be noted that, due to the n to m relationship, GPS traces may be matched to more than one
street.
Figure 34 shows a street map of the city of Heidelberg indicating the coverage of streets with
GPS traces. The street segments are colorized according to the number of traces, from green (few
traces) to red (many traces). Street segments visualized in blue do not have any matched GPS
traces. A large share of the street segments are covered with at least one trace, however, many
streets have no matching GPS traces. Streets with over 20 corresponding traces are relatively rare.
Streets with a high traffic volume have the best coverage. These are the motorways (German
Autobahn) in the upper left corner, the streets on both banks of the river Neckar (upper right
corner) and the two primary streets in the center of the figure running in north-south direction.
Figure 34: Map, showing the coverage of the streets with GPS traces. (Map: OSM)
The coverage with GPS traces has been investigated in more detail. It will later be investigated
whether the number of traces has an effect on the accuracy. Therefore, it is interesting to know how
many corresponding traces can theoretically be used to calculate the incline. Figure 35 shows the
share of street length by street type with different numbers of assigned traces. The street types
‘motorway’, ‘primary’, ‘secondary’ and ‘cycleway’ have a coverage with at least one GPS trace of almost
100 %. Other street types such as ‘residential’, ‘path’ and ‘footway’ are covered with at least one
trace in 42 % to 61 % of the cases. This shows that the coverage with GPS traces depends on
the traffic volume and the street type. For example, the motorway, which is probably the street type
with the highest traffic volume, is completely (100 %) covered with at least 25 traces, followed by
the street type ‘primary’, which is below ‘motorway’ in the hierarchy of street types. All
primary streets (100 %) have at least 5 GPS traces; this decreases to 70 % with at least 20 traces.
With 30 traces, still 20 % of the primary streets are covered. Of the streets of the types ‘secondary’,
‘tertiary’ and ‘cycleway’, around 95 % are covered with at least 1 trace and approximately 80 %
with at least 5 traces. The street types with the lowest coverage are
‘residential’, ‘footway’ and ‘path’. Still, 42 % to approximately 60 % have at least 1 trace,
whereas this decreases significantly to 15 % with at least 5 traces. With 30 or more traces, less
than 5 % of the streets of the aforementioned types are covered. Generally, it can be said that
streets of higher priority have more matched GPS traces. Streets which can be used for bicycles
perform as well as secondary and tertiary streets. Paths which are dedicated to pedestrians are
comparable to residential streets.
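The shares discussed above can be reproduced with a small aggregation. The following Python sketch
is an illustration only and not the tooling used in this thesis; the tuple layout (street type,
length in m, number of matched traces), the sample values and the function name coverage_share are
assumptions made for this example.

```python
from collections import defaultdict


def coverage_share(segments, thresholds=(1, 5, 10, 15, 20, 25, 30)):
    """Return {street_type: {n: share of street length with at least n traces}}.

    segments: iterable of (street_type, length_m, n_traces) tuples (assumed layout).
    """
    total = defaultdict(float)
    covered = defaultdict(lambda: defaultdict(float))
    for street_type, length_m, n_traces in segments:
        total[street_type] += length_m
        for n in thresholds:
            if n_traces >= n:
                covered[street_type][n] += length_m
    return {
        street_type: {n: covered[street_type][n] / total[street_type] for n in thresholds}
        for street_type in total
    }


# Made-up sample data, not values from the pilot region.
sample_segments = [
    ("primary", 400.0, 32),
    ("primary", 600.0, 7),
    ("residential", 300.0, 0),
    ("residential", 700.0, 2),
]
shares = coverage_share(sample_segments)
```

The shares are weighted by street length rather than by the number of segments, which matches the
"share of street length" axis of Figure 35.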
Figure 35: The coverage with GPS traces for different street types.
Besides the coverage, the density of GPS track points within a GPS trace is also examined, using
the preprocessed data set. The term density refers to the distance between two adjacent GPS track
points. A shorter distance between two track points means that there are more GPS track points per
street segment to calculate the incline. Due to the noise in the GPS data, more points make the
result more robust. The interval at which GPS track points are recorded can usually be set in the
settings of the device. It can be either time-dependent (e.g. every second) or location-dependent
(e.g. every 30 m). Figure 36 shows a diagram of the average distance between two consecutive track
points, distinguished by street type. ‘Motorway’ is the street type with the largest average
distance between two points, approximately 35 m, followed by ‘primary’ and ‘secondary’ streets
(approximately 17 m). In the middle, the street types ‘cycleway’ and ‘tertiary’ have an average
distance of almost 14 m. This value also corresponds to the overall average distance. The street
types with the shortest distance are ‘residential’, ‘footway’ and ‘path’. As in the investigation
of the coverage, a dependency between the distance, the priority of the street type and the average
speed can be seen. Whereas a motorway is likely to be the street type with the highest speed, a
footway or path is the one with the lowest, since these are used only by pedestrians. This leads to
the assumption that most of the track points are recorded at a time-dependent interval, which
results in different distances between two adjacent GPS track points depending on the speed. With
an average distance between two adjacent GPS track points of 14 m, about two track points fall into
one pixel of the SRTM-1 DEM, which has a horizontal resolution of 30 m by 30 m.
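As a rough illustration of how such an average spacing can be obtained, the following Python sketch
computes the mean great-circle distance between consecutive track points and the spacing expected
for a time-dependent recording interval. It is not the implementation used in this thesis; the
spherical earth radius, the haversine formula, the sample coordinates and the function names are
assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean earth radius, spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS 84 coordinates given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_point_spacing(trace):
    """Average distance between consecutive (lat, lon) points; None for fewer than 2 points."""
    if len(trace) < 2:
        return None
    distances = [haversine_m(*p, *q) for p, q in zip(trace, trace[1:])]
    return sum(distances) / len(distances)


def expected_spacing_m(speed_kmh, interval_s=1.0):
    """Point spacing resulting from a time-dependent recording interval at constant speed."""
    return speed_kmh * 1000.0 / 3600.0 * interval_s


# Hypothetical trace heading north in steps of 0.0001 degrees of latitude.
sample_trace = [(49.4000, 8.67), (49.4001, 8.67), (49.4002, 8.67)]
spacing = mean_point_spacing(sample_trace)
```

Under these assumptions, a one-second interval at 126 km/h yields a spacing of 35 m, which is of the
same order as the average distance observed on motorways.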
[Figure 35 (chart): share of street length in % (vertical axis, 0 % to 100 %) against the minimum
number of traces (horizontal axis: 1, 5, 10, 15, 20, 25, 30); one series per street type: cycleway,
footway, motorway, path, primary, residential, secondary, tertiary.]
Figure 36: Average distance of two adjacent GPS track points differentiated by street type.
5.2 Analysis of Calculated Incline
The GPS incline was calculated for a total street network length of 3064 km in the pilot region.
Due to the incomplete coverage of the OSM GPS traces, this corresponds to approximately 57 % of
the complete street network, which has a total length of 5338 km. The map in Figure 37 visualizes
the calculated GPS incline for the street network within the pilot region. Streets for which no
GPS incline was calculated are not shown. It can be seen that the western part of the region is
mainly flat, whereas the eastern part is mountainous. In section 5.2.2, the achieved accuracy of
the calculated GPS incline is examined using the DTM incline. In addition, the accuracy of the
GPS incline is compared to the incline derived from the SRTM-1 DSM in section 5.2.3. This shows
how the approach of this research performs in comparison to other open-licensed and globally
available data.
Figure 37: Visualization of the GPS incline. Streets with no coverage are not displayed. (Map: OSM)
[Figure 36 (bar chart): average distance between two adjacent track points in m by street type:
path 8, footway 9, residential 10, tertiary 14, cycleway 14, overall 14, secondary 16, primary 18,
motorway 36.]
5.2.1 Exclusion of data from the evaluation
The error of incline may be influenced by irregularities of the DTM. Figure 38 shows a motorway
junction with the underlying DTM for which the DTM incline is calculated erroneously, with a
value of 30 %. This phenomenon occurs especially at bridges or underpasses, since bridges are
partly removed from the DTM. Due to the characteristics of the pilot region, streets with an
incline of more than 20 % are likely to be rare. However, for over 20 km of the street segments
the DTM incline is above 20 %, and in some cases even over 100 %. These 20 km correspond to less
than 1 % of the entire street network but still influence the result due to the high magnitudes.
Therefore, all street segments with a DTM incline greater than 20 % are excluded from the
evaluation to make sure that the result is not distorted by wrong reference data. Furthermore,
street segments with a GPS incline or an SRTM incline of over 35 % are not considered in this
evaluation. The steepest street in the world, Baldwin Street in Dunedin (New Zealand), has a
maximum incline of approximately 35 %.⁵⁵ Apart from a few small paths in mountainous regions,
streets with inclines over 35 % practically do not exist; such values are therefore considered
calculation errors and are not used for this evaluation.
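The two exclusion rules can be expressed as a simple filter. The following Python sketch is only an
illustration; the dictionary keys, the treatment of missing values and the use of absolute values
(so that downhill inclines are handled like uphill ones) are assumptions, since these details are
not specified above.

```python
def keep_for_evaluation(segment, dtm_limit=20.0, other_limit=35.0):
    """Return True if a street segment should enter the evaluation.

    segment: dict with incline values in percent under the (assumed) keys
    'dtm', 'gps' and 'srtm'; 'gps' and 'srtm' may be missing or None.
    """
    # Rule 1: implausible reference data (DTM incline above 20 %).
    if abs(segment["dtm"]) > dtm_limit:
        return False
    # Rule 2: implausible calculated inclines (GPS or SRTM incline above 35 %).
    for key in ("gps", "srtm"):
        value = segment.get(key)
        if value is not None and abs(value) > other_limit:
            return False
    return True


# Made-up example segments.
bridge_artifact = {"dtm": 30.0, "gps": 2.0, "srtm": 1.5}
gps_outlier = {"dtm": 4.0, "gps": 41.0, "srtm": 3.0}
plausible = {"dtm": -4.0, "gps": -3.5, "srtm": None}
```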
Figure 38: Erroneously calculated DTM incline, due to irregularities of the LiDAR DTM.
5.2.2 Accuracy of GPS incline
The GPS incline and the DTM incline are compared by calculating their difference. This results in
the incline error $e_m$, which is given in percent. The following formula was used:
$e_m = m_{GPS} - m_{DTM}$,
where
$e_m$ = incline error in %
$m_{GPS}$ = incline derived from GPS traces
$m_{DTM}$ = incline derived from the DTM
55 https://de.wikipedia.org/wiki/Baldwin_Street, checked on 20/07/2015
In the following sections, the error is evaluated overall, differentiated by land use classes as
well as by different types of terrain such as flat and mountainous. Furthermore, it is investigated
whether the number of traces which have been used to derive the GPS incline affects the accuracy.
5.2.2.1 Overall error
Figure 39 shows the street network colorized by the incline error. There are 5 error classes,
ranging from smaller than 1 % to greater than 5 %, shown in a gradient from green (small error) to
red (large error). Due to the incomplete coverage of GPS traces, the incline could not be
calculated for all streets; those streets are not shown in the map. The map shows that street
segments with a medium or large error are not equally distributed in this area. In the western
part, only a few short street segments are colored in yellow or red. As stated in section 4.1, the
western part is mainly characterized by flat terrain and farmland. In contrast, more streets with
a larger error can be found in the eastern part. This area is part of the “Odenwald”, a
mountainous region with mountains up to 600 m and mainly forested areas.
Figure 39: Visualization of the error of GPS incline in the pilot region. (Map: OSM)
In the following, the distribution of the incline error is investigated. The histogram in Figure 40
shows the distribution of the incline error in percent within a range of ± 13 %. The vertical axis
represents the relative frequency of error values falling into bins with a width of 0.5 %. Errors
with a magnitude greater than 13 % are too few to be visualized and are therefore not shown in the
histogram. The error of the incline appears to be normally distributed around the mean of -0.05 %.
The standard deviation is σ = 3.97 %, which means that approximately 68 % of the incline errors are
within the range of ± σ. When the standard deviation is recalculated with only 95 % of the data, it
results in σ = 2.31 %, which is not much more than half of the value calculated with 100 % of the
data. This means that some GPS incline values have been calculated with a large error.
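The effect of a few large errors on the standard deviation can be illustrated with a short Python
sketch. It is not the evaluation code of this thesis: how the 95 % subset was selected is not
stated above, so the sketch assumes that the 5 % of values with the largest absolute error are
dropped, and it uses the population standard deviation. The sample values are made up.

```python
import statistics


def incline_errors(gps_inclines, dtm_inclines):
    """Incline error per segment: GPS incline minus DTM incline, both in percent."""
    return [gps - dtm for gps, dtm in zip(gps_inclines, dtm_inclines)]


def trimmed_std(errors, keep=0.95):
    """Standard deviation of the `keep` share of errors with the smallest magnitude.

    Assumption: the discarded values are those with the largest absolute error.
    """
    ranked = sorted(errors, key=abs)
    n_keep = max(2, round(len(ranked) * keep))
    return statistics.pstdev(ranked[:n_keep])


# 38 small errors of +/- 0.5 % and 2 gross errors of +/- 30 % (made-up values).
gps = [2.5, -0.5] * 19 + [32.0, -28.0]
dtm = [2.0, 0.0] * 19 + [2.0, 2.0]
errors = incline_errors(gps, dtm)

std_all = statistics.pstdev(errors)
std_95 = trimmed_std(errors)
```

In this constructed example, two gross errors out of 40 values dominate the standard deviation of
the full data set, whereas the trimmed value reflects the spread of the remaining errors.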
Figure 40: Histogram of the overall incline error in percent and the bell-curve (red).
[Figure 40 (histogram): relative frequency (vertical axis, 0 % to 18 %) against incline error in %
(horizontal axis, -13 to 13).]
Table 7 shows the absolute length in kilometers of the street segments for which the incline was
calculated with an error below the given threshold. In addition, it gives the corresponding
percentage of the total length of street segments for which it was possible to derive incline
information (3064 km). Depending on the application which uses the incline information, the
acceptable error may differ. The higher the acceptable error, the more street segments are
available. For almost two-thirds (61 %) of the street network with GPS incline, the incline can be
derived with an error smaller than 1 %. This increases to 85 % for an error smaller than 3 % and to
92 % if an error smaller than 5 % can be accepted.
Incline error e_m    Absolute length of street segments    Share of length of street network with GPS incline
< 1 %                1872 km                               61 %
< 2 %                2370 km                               77 %
< 3 %                2607 km                               85 %
< 4 %                2731 km                               89 %
< 5 %                2817 km                               92 %
Table 7: The length of street segments for different incline error classes.
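The shares in Table 7 follow directly from the listed lengths and the total length of 3064 km. The
following Python sketch recomputes them and shows, as an assumption-laden illustration, how such a
length-weighted share could be derived from per-segment errors; the tuple layout and the function
name share_below are hypothetical.

```python
def share_below(segments, max_error_pct):
    """Length and share (in %) of segments whose absolute incline error is below a threshold.

    segments: iterable of (length_km, incline_error_pct) tuples (assumed layout).
    """
    segments = list(segments)
    total_km = sum(length for length, _ in segments)
    good_km = sum(length for length, error in segments if abs(error) < max_error_pct)
    return good_km, 100.0 * good_km / total_km


# Cross-check of the rounded shares using the lengths given in Table 7.
TOTAL_WITH_GPS_INCLINE_KM = 3064
table7_km = {1: 1872, 2: 2370, 3: 2607, 4: 2731, 5: 2817}
table7_share = {
    limit: round(100 * km / TOTAL_WITH_GPS_INCLINE_KM) for limit, km in table7_km.items()
}

# Made-up segments to demonstrate share_below.
example = share_below([(2.0, 0.4), (1.0, -1.5), (1.0, 6.0)], 2.0)
```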
5.2.2.2 By Land Use Classes
As in section 5.1.1, the influence of the land use classes on the accuracy of the GPS incline is
investigated. The standard deviation has been calculated for each land use class and is shown in
Table 8. In addition, the length in kilometers and the percentage of the street segments within
each land use class are given. The land use class ‘forest’ has the highest standard deviation with
σ = 5.6 %, whereas for street segments running through farmland and industrial areas the
GPS incline could be calculated with the best accuracy (σ = 2.2 % and 2.3 %, respectively). The
standard deviations of the other land use classes lie in between, from σ = 3.0 % for ‘allotments’
to σ = 3.8 % in residential areas. The reason may be the different characteristics of the land use
classes with regard to the obstruction of the signals from the satellites to the GPS receiver. As
in the previous section, the standard deviation decreases for all land use classes when only 95 %
of the data is used. This shows that GPS incline values with errors of high magnitude can be found
in all land use classes.
Land use class    Std dev. [%]    Std dev. 95 % [%]    Length [km]    Length [%]
overall           4.0             2.3                  3064           100
forest            5.4             3.6                  1072           35
residential       3.8             2.1                  844            28
farmland          2.2             1.1                  560            18
grass             3.7             2.0                  187            6
allotments        3.0             1.7                  102            3
commercial        3.1             1.6                  53             2
industrial        2.3             1.7                  48             2
Table 8: The achieved accuracy of GPS incline differentiated by land use classes.
The achieved accuracies of the GPS incline differentiated by land use classes reflect the relative
accuracy of the GPS track points evaluated in section 5.1.1.2. Forested areas perform worst,
whereas ‘farmland’ is one of the land use classes with the highest relative accuracy. Differences
compared to the relative accuracy of the GPS raw data can be found for ‘grass’ and ‘industrial’.
For the land use class ‘grass’, the relative accuracy of the GPS track points is the best, whereas
the incline could only be determined with medium accuracy. In industrial areas, the opposite can
be observed: the GPS incline could be calculated with one of the best accuracies, although the
relative accuracy of the GPS traces within industrial areas is not as good. The reason may be a
lack of data. Street segments within the land use classes ‘grass’, ‘allotments’, ‘commercial’ and
‘industrial’ make up only a small part of the street segments with calculated GPS incline. This
may be due to missing GPS information in these areas or due to fewer streets. The result would be
more reliable and robust if more data were available.
5.2.2.3 By Terrain Classes (mountainous / flat)
Besides the influence of the land use class on the achieved accuracy, it is also evaluated whether
the terrain affects the accuracy of the GPS incline. A distinction is made between street segments
in flat and in mountainous areas. A street segment is considered flat if the DTM incline is smaller
than 2 %. Street segments with a DTM incline of over 5 % are considered to be in a mountainous
area. The results are shown in Table 9. The length of the flat street segments sums up to 2018 km,
which represents 66 % of all street segments for which the GPS incline could be calculated. The
length of streets in mountainous areas adds up to 603 km, which corresponds to 20 %. Consequently,
the remaining 14 % are street segments with a DTM incline ranging from 2 % to 5 %. These street
segments are not considered in this section.
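The classification rule can be sketched in a few lines of Python. This is an illustration under
assumptions: the function names and the tuple layout are hypothetical, and the absolute value of
the DTM incline is used so that downhill segments are classified like uphill ones.

```python
def terrain_class(dtm_incline_pct):
    """Classify a street segment by the magnitude of its DTM incline in percent."""
    magnitude = abs(dtm_incline_pct)
    if magnitude < 2.0:
        return "flat"
    if magnitude > 5.0:
        return "mountainous"
    return None  # 2 % to 5 %: not evaluated in this section


def length_by_class(segments):
    """Sum the segment lengths per terrain class.

    segments: iterable of (length_km, dtm_incline_pct) tuples (assumed layout).
    """
    totals = {"flat": 0.0, "mountainous": 0.0, None: 0.0}
    for length_km, dtm_incline_pct in segments:
        totals[terrain_class(dtm_incline_pct)] += length_km
    return totals


# Made-up segments: 2.0 km flat, 1.0 km mountainous, 0.5 km in between.
totals = length_by_class([(2.0, 0.5), (1.0, 7.0), (0.5, -3.0)])
```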
As already stated in sections 5.2.2.1 and 5.2.2.2, the overall standard deviation is σ = 4.0 %.
Within flat areas the standard deviation is 2.8 %, whereas in mountainous areas a standard
deviation of more than 7 % is calculated. Using only 95 % of the data, it improves to σ = 1.4 %
(flat) and σ = 5.7 % (mountainous). Since the standard deviation in mountainous areas is about 2.6
times (100 % of the data) to 4 times (95 % of the data) larger than in flat areas, it can be said
that the incline within flat areas can be determined with higher accuracy than in mountainous
areas. However, it has to be noted that the majority (73 %) of the street segments in mountainous
areas run through forested areas, whereas only 19 % of the streets in flat areas fall in forested
areas. Thus, the result may also be influenced by the poor accuracy of incline values within
forests (cf. section 5.2.2.2).
Terrain Class Std Dev. [%] Std Dev. 95 % [%] length [km] length [%]
overall 4.0 2.3 3064 100
flat 2.8 1.4 2018 66
mountainous 7.2 5.7 603 20
Table 9: The achieved accuracy of GPS incline differentiated by terrain classes.
5.2.2.4 Effect of Number of GPS Traces on Overall Accuracy
In the previous sections, all street segments for which the GPS incline could be calculated have
been considered, regardless of the number of traces used to calculate the incline. As described in
section 0, the GPS incline is calculated from the elevation differences of the track points of a
GPS trace. If a street segment has multiple traces, the incline is calculated individually for
each trace. The incline values of the individual traces are then aggregated by calculating the
average. It is now evaluated whether the accuracy increases with the number of GPS traces per
street segment. The diagram in Figure 41 shows, in blue, the percentage of the street network with
an incline error smaller than 2 % for a given minimum number of GPS traces. Here, 100 % is
equivalent to the sum of the lengths of all street segments having at least that number of matched
GPS traces. The red line shows the share with respect to the entire street network, including
those street segments for which the GPS incline could not be derived.
It was possible to derive the GPS incline with an error smaller than 2 % for 2370 km (77 %) of the
street length when considering street segments with at least one GPS trace. This is equivalent to
44 % of the total street network. When street segments with fewer than 5 GPS traces are neglected,
1128 km of the street network are covered. Of these 1128 km, 87 % of the street length has a
GPS incline with an error smaller than 2 %. Compared to the usage of at least 1 trace, the
percentage of streets with a small error increases; however, the coverage with respect to the
total street network decreases significantly to less than 20 %. This trend continues the more
traces are required. When considering street segments with at least 30 traces, the percentage of
street length with an error smaller than 2 % increases to
98 %, although with only 133 km (corresponding to only 2.5 % of the total street network in the
pilot region) there are not many street segments covered with at least 30 GPS traces.
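The two curves of Figure 41 can be computed per minimum number of traces as sketched below in
Python. The sketch only illustrates the bookkeeping; the tuple layout, the sample values and the
function name accuracy_by_min_traces are assumptions and not part of the software developed for
this thesis.

```python
def accuracy_by_min_traces(segments, total_network_km, min_traces, max_error_pct=2.0):
    """Shares of street length with a small incline error for a minimum number of traces.

    segments: iterable of (length_km, n_traces, incline_error_pct) tuples (assumed layout).
    Returns None if no segment has at least `min_traces` traces.
    """
    eligible = [s for s in segments if s[1] >= min_traces]
    eligible_km = sum(s[0] for s in eligible)
    if eligible_km == 0:
        return None
    good_km = sum(s[0] for s in eligible if abs(s[2]) < max_error_pct)
    return {
        "eligible_km": eligible_km,
        # Blue series: relative to segments with at least `min_traces` traces.
        "share_of_eligible": 100.0 * good_km / eligible_km,
        # Red series: relative to the entire street network.
        "share_of_network": 100.0 * good_km / total_network_km,
    }


# Made-up segments in a hypothetical network of 20 km total length.
sample = [(4.0, 1, 3.0), (3.0, 6, 0.5), (1.0, 8, 2.5), (2.0, 2, 1.0)]
at_least_1 = accuracy_by_min_traces(sample, 20.0, 1)
at_least_5 = accuracy_by_min_traces(sample, 20.0, 5)
```

In this constructed example, the share relative to the eligible segments rises from 50 % to 75 %
when at least 5 traces are required, while the share relative to the whole network drops from 25 %
to 15 %, which mirrors the opposing trends of the two curves.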
Figure 41: The percentage of streets, with an incline error smaller than 2 % and their share with respect to the
entire street network.
5.2.3 Comparison GPS incline and SRTM incline
The GPS incline derived with the approach presented in this thesis is now compared to the incline
derived from the SRTM-1 DSM, as this is an alternative data source for incline information. As
described in section 4.3.4, the DSM is freely available with a horizontal resolution of
1 arcsecond, which is equivalent to approximately 30 m at the equator.
[Figure 41 (chart): share in % (vertical axis, 0 % to 100 %) against the minimum number of traces
(horizontal axis: 1, 5, 10, 15, 20, 25, 30); two series: share of street length with GPS incline
error < 2 % [%] and share of total street network [%].]
The SRTM incline of the street network was calculated as described in section 4.5, and the
standard deviation has been calculated in the same way as for the GPS incline. This gives a
comparable measure which indicates how the accuracy of the GPS incline performs in comparison to
the SRTM incline. Table 10 gives an overview of the share of the street network for which the
incline can be determined with an error smaller than 2 % for both data sources (GPS and SRTM).
The numbers for the GPS incline are taken from the diagram in section 5.2.2.4.
Data source        Percentage of street network with incline error < 2 %    Coverage
SRTM               73 %                                                     100 %
GPS, ≥ 1 trace     77 %                                                     44 %
GPS, ≥ 5 traces    86 %                                                     18 %
Table 10: Comparison of SRTM and GPS incline in terms of amount of street network with an incline error
smaller than 2 %.
Using SRTM, it is possible to derive the incline with an error smaller than 2 % for 2236 km out of
3064 km (73 %) of the street network. When using GPS traces, this depends on the minimum number of
GPS traces which are used for the determination of the incline. Considering street segments with
at least 1 trace, the incline can be determined with an error smaller than 2 % for 77 % of the
street network. When streets with fewer than 5 GPS traces are neglected, the percentage increases
to 86 %. However, this requires sufficient coverage with GPS traces, which is the case for only
18 % of the entire street network in the pilot region. To summarize, the GPS incline performs
slightly better than the SRTM incline, whereas the coverage is more complete with SRTM. It also
has to be noted that GPS traces can always be collected by volunteers, so the coverage may
increase.
Besides the length of the streets for which the incline was derived within a certain error range,
the standard deviation is compared in the following sections. First, the standard deviations of
the GPS and SRTM incline are differentiated by land use class and second by terrain class.
5.2.3.1 By Land Use Classes
Table 11 shows the comparison of the standard deviations by land use classes. The standard
deviations have been calculated using 95 % of the data. Besides the standard deviation of the
SRTM incline σSRTM, that of the GPS incline σGPS (cf. section 5.2.2.2) and their difference are
shown. Additionally, the standard deviation has been calculated from the error values of the
GPS incline considering only those street segments which have at least 5 GPS traces (σGPS 5T). The
difference between σSRTM and σGPS 5T is given in the last column. Overall, σSRTM is 3.1 %, which is
0.8 percentage points larger than σGPS and even 1.5 percentage points larger than σGPS 5T. This
means that the GPS incline can be derived with less uncertainty, especially when street segments
with fewer than 5 GPS traces are neglected. This holds true within all land use classes. For the
GPS incline derived from at least 5 traces, the difference of the standard deviations is larger
than 1 percentage point for all land use classes, reaching almost 2 percentage points in the land
use class ‘grass’.
Land use class    σSRTM [%]    σGPS [%]    σGPS - σSRTM [%]    σGPS 5T [%]    σGPS 5T - σSRTM [%]
overall           3.1          2.3         -0.8                1.6            -1.5
forest            4.2          3.6         -0.6                3.1            -1.1
farmland          2.0          1.1         -0.9                0.9            -1.1
residential       2.8          2.1         -0.7                1.5            -1.3
commercial        2.9          1.6         -1.3                1.5            -1.4
allotments        2.6          1.7         -0.9                1.3            -1.3
grass             3.3          2.0         -1.3                1.5            -1.8
industrial        2.6          1.7         -0.9                1.0            -1.6
Table 11: Comparison of the standard deviations of the incline error, overall and differentiated by land use
classes.
The SRTM incline may perform worse for several reasons. Firstly, the SRTM-1 DEM is a DSM, which
means that structures on the earth's surface are not removed from the elevation information. This
would normally not matter, since streets, apart from those in forests or under bridges, are hardly
covered by trees or man-made structures. But due to the low horizontal resolution of 30 m, this
becomes a problem. If one square (or pixel) of the DSM covers not only the street but also
buildings and trees right next to the street, the value of this square is an average elevation of
the ground and the other structures. In contrast to SRTM, the GPS track points are recorded on the
earth's surface or, to be more precise, at a constant height above the ground (in the car or in a
backpack). In addition to this problem, the SRTM data has a vertical accuracy of only 6.2 m, which
is a relatively large error in comparison to the relative accuracy of GPS of 0.6 m within a
distance of 30 m (cf. section 5.1.1.2).
5.2.3.2 By Terrain Classes
The standard deviations are now differentiated between flat areas (DTM incline < 2 %) and
mountainous areas (DTM incline > 5 %). As shown in Table 12, the SRTM incline performs better
in flat areas (σSRTM = 2.5 %) than in mountainous areas (σSRTM = 5.1 %). The GPS incline performs
better than the SRTM incline in flat areas, whereas in mountainous areas the SRTM incline is
slightly better. This is surprising, since the GPS incline could be derived more accurately
overall, within all land use classes as well as in flat areas. The reason why the SRTM incline is
slightly better may not be that the SRTM incline could actually be determined more accurately, but
that the GPS incline performs exceptionally poorly in mountainous regions.
Terrain        σSRTM [%]    σGPS [%]    σGPS - σSRTM [%]    σGPS 5T [%]    σGPS 5T - σSRTM [%]
overall        3.1          2.3         -0.8                1.6            -1.5
flat           2.5          1.4         -1.1                1.0            -1.5
mountainous    5.1          5.7         0.6                 5.4            0.3
Table 12: Comparison of the standard deviations of the incline, overall and differentiated by terrain classes.
5.3 Limitations of Approach
As described in section 0, the approach of deriving incline information from user-generated
GPS traces results in an incline with a reasonable accuracy; however, the methodology also has
limitations. In this section, these limitations are discussed critically.
In the OpenStreetMap Wiki (2015e), a convention regarding the incline of streets is given. When
adding incline information to OSM-Ways, the street segment shall be split at the beginning and at
the end of the inclined part. The value which is then added to the key ‘incline’, should represent the
maximum value which can be found within this part of street rather than the average incline. But
using the approach of this thesis, the average incline is calculated per street segment. In the
preprocessing, the street segments were split at their intersection points. However, those parts of
the streets in which there is an incline are not detected. Thus, the geometry objects cannot be split
at the beginning and at the end of the inclined part of the street.
Calculating the average incline per street segment without detecting the steepest parts is not a
problem as long as the incline is constant over the length of the street segment. In reality,
however, there are situations in which this approach leads to wrong results. Figure 42 shows two
examples, (a) and (b). In (a), the street segment contains 3 parts with different inclines. Two of
them are flat, whereas the one in the middle is
inclined. The average incline calculated for this street segment is lower than the steepest
incline in reality. This becomes a problem if a person expects, for example, a 5 % incline along a
distance of 100 m and in reality faces a 10 % incline within a distance of 50 m. At least, in this
case it is known that there is an incline. The example in Figure 42(b) shows a situation in which
the average incline results in 0 %, since the street segment contains two inclined parts with the
same magnitude but in opposite directions.
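Both situations can be reproduced numerically. The following Python sketch is a simplified
illustration: it describes a street segment as a list of parts with a length and a constant
incline and uses a length-weighted average, which is an assumption made for this example and not
the point-based averaging of the thesis workflow. The part lengths are chosen to match the numbers
mentioned above.

```python
def average_and_steepest_incline(parts):
    """Return (length-weighted average incline, steepest incline) in percent.

    parts: list of (length_m, incline_pct) tuples describing consecutive parts
    of one street segment (assumed representation).
    """
    total_length = sum(length for length, _ in parts)
    average = sum(length * incline for length, incline in parts) / total_length
    steepest = max(parts, key=lambda part: abs(part[1]))[1]
    return average, steepest


# Situation (a): flat, 10 % over 50 m, flat again (100 m in total).
case_a = average_and_steepest_incline([(25.0, 0.0), (50.0, 10.0), (25.0, 0.0)])

# Situation (b): uphill and downhill parts of equal length and magnitude.
case_b = average_and_steepest_incline([(50.0, 8.0), (50.0, -8.0)])
```

In case (a) the average is 5 % although the steepest part has 10 %, and in case (b) the average is
0 % although both parts are inclined. Storing the steepest value per segment, as suggested by the
OSM convention, would avoid this masking effect.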
Figure 42: Situations where the calculated incline differs from the steepest incline.⁵⁶
This problem was not addressed in this thesis as the main focus was on the examination of user-
generated GPS traces with regard to their feasibility of deriving incline information as well as the
development of a method and tools which handle the GPS data and process them to derive incline
information.
56 Bicycle pictogram: © Pixabay user ‘ClkerFreeVectorImages’, source of image:
https://pixabay.com/de/fahrrad-piktogramm-sport-307977/, checked on 20/07/2015
6 Conclusion and Outlook
6.1 Conclusion
Different user groups may benefit from route planning which considers the incline of a street
network. Examples are mobility-restricted people, such as wheelchair users, people with walking
aids or parents with pushchairs, for whom streets or paths may be inaccessible if the incline
exceeds a certain magnitude. If the incline is known in advance, a route can be planned that
avoids steep streets. The chosen route may be longer but not as steep as the shortest one.
Furthermore, incline information is useful for route planning for electricity-powered vehicles or
bicycles.
The data of the OpenStreetMap project, which is a freely available source of street network data
and often used by routing engines, provides incline information for only 0.2 % of the street
network. Therefore, the automatic derivation of incline values may fill the gap. One source of
elevation information to derive incline information for a street network may be digital elevation
models (DEMs). DEMs acquired from LiDAR measurements are very accurate; however, they are usually
also very expensive and not globally available. Alternatively, low-cost DEMs like the SRTM-1 DEM
or ASTER GDEM are freely and (almost) globally available but are limited by their horizontal
resolution of 30 m and vertical accuracy of 9 m (ASTER GDEM) and 6 m (SRTM-1 DEM). Another source
of elevation data, which is freely available and at least theoretically globally available, are
user-generated GPS traces of the OpenStreetMap project. Initially collected for the purpose of map
making, the data might also serve other purposes. In contrast to SRTM-1 and ASTER, and depending
on the coverage, many GPS track points may fall within a square of 30 m by 30 m, which is the
horizontal resolution of the SRTM-1 DEM and ASTER GDEM. Therefore, there is more information
about the elevation, which potentially results in incline values of higher accuracy, although the
absolute vertical accuracy of GPS is known to be fairly poor. But rather than the absolute
elevation, only the difference in elevation of two adjacent points is of relevance. The relative
accuracy is assumed to be better than the absolute one.
The following aims for this thesis have been formulated:
- Creation and implementation of a workflow to calculate the incline of streets, using user-
contributed GPS traces.
- Assessment of the quality of voluntarily collected GPS traces in terms of
  - vertical accuracy (absolute and relative)
  - coverage of GPS traces
- Assessment of the achieved quality of the incline information, compared to LiDAR and
SRTM-1 DEM.
- Publication of developed software as Open Source and provision to the OpenStreetMap
community
The steps to fulfill the aforementioned aims will be discussed in the following.
Before calculating the GPS incline for the segments of the street network, different steps have to be
undertaken to prepare the two main input data sets, the GPS traces and the street network. The GPS
traces are downloaded from the OpenStreetMap (OSM) project and include over 4000 traces in the
pilot region (Heidelberg Area / Germany). Not all of the traces have the optional elevation
information, therefore, only 3842 traces with over 2 million GPS track points remain to derive the
incline. To import the GPS traces, which can be downloaded as a compressed file archive
(*.tar.xz), a Java tool has been developed. It reads the file archive, filters the GPS traces by
bounding box, rejects all traces without elevation information and stores the remaining traces in
the database. Since the elevation information of the GPS traces suffers from noise and other
irregularities, the traces have to be preprocessed in the following step. The street network which
is going to be enhanced with incline
information, has also been taken from OSM. The data has been imported to the database using the
Java-tool Osmosis. For the pilot region the street network has a total length of 5338 km, containing
different types of streets. It includes for example residential streets (18 %) and motorways (2 %)
but also paths which are exclusively dedicated to pedestrians or cyclists (together 20 %). Like the
GPS traces, also the street network has been preprocessed. The streets have been split at their
intersection points with other streets to avoid long streets which may span several valleys and hills.
It is assumed that the GPS traces were recorded while traveling on a street, which is important
for the next step. For the incline calculation, the assignment of the GPS traces to the street
segments (map matching) is an essential step. The assumption has been made that streets which
are parallel to each other (e.g. a street with two separate lanes, or a footpath next to a street)
also have the same incline. Consequently, GPS traces which were recorded on one of the parallel
streets can also be used for the incline calculation of the other one. This increases the number
of traces per street; however, this assumption may also lead to errors if two parallel streets
have different inclines.
The GPS incline calculation was done for each street segment individually. First of all, the
previously assigned GPS traces of the street are selected. A buffer of the street segments with a size
of 15 m is then used to clip the selected traces. Only those parts of the traces which fall into the
buffer shall be used to calculate the incline. After that, the incline for each GPS trace is calculated
by averaging all inclines derived from every two adjacent GPS track points. If there are multiple
traces per street segment, the procedure is repeated for the other ones as well. At the end, the
incline values of all traces are averaged to the final incline of the street segment.
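The two averaging steps summarized above can be sketched as follows in Python. This is a minimal
illustration and not the developed software: it assumes that each clipped trace is already given
as a list of (distance along the street segment in m, elevation in m) pairs sorted along the
segment, and the function names are hypothetical.

```python
def trace_incline(points):
    """Average incline in percent over all pairs of adjacent track points of one trace.

    points: list of (distance_along_segment_m, elevation_m) tuples (assumed representation).
    Returns None if no pair with a positive horizontal distance exists.
    """
    inclines = []
    for (d0, z0), (d1, z1) in zip(points, points[1:]):
        run = d1 - d0
        if run <= 0:
            continue  # skip duplicate or out-of-order points
        inclines.append(100.0 * (z1 - z0) / run)
    return sum(inclines) / len(inclines) if inclines else None


def segment_incline(traces):
    """Final incline of a street segment: mean of the per-trace inclines."""
    values = [v for v in (trace_incline(t) for t in traces) if v is not None]
    return sum(values) / len(values) if values else None


# Two made-up traces on the same street segment: 5 % and 3 % on average.
traces = [
    [(0.0, 100.0), (10.0, 100.5), (20.0, 101.0)],
    [(0.0, 200.0), (20.0, 200.6)],
]
incline = segment_incline(traces)
```

Note that the two traces have very different absolute elevations but similar elevation
differences, which is the reason why only the relative accuracy of the GPS elevation matters for
the incline.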
The second aim of this thesis is the evaluation of the GPS raw data with regard to the absolute
and relative accuracy, using a LiDAR DTM as reference. Overall, the RMSE (using 90 % of the data)
of the GPS elevation is 27 m, and depending on the land use class it ranges from 21 m for GPS
track points in farmland to 35 m in the land use class ‘allotments’. This is worse than stated in
the literature in which the accuracy of low-cost GPS receivers has been assessed. It may be the
case that smartphone apps use elevation databases which rely on DEMs, such as SRTM-1. In addition,
some handheld devices have a barometric measurement unit. Furthermore, the GPS elevation refers in
many cases to the mean sea level, although it should be given as the height above the WGS 84
ellipsoid. Only some GPS track points are referred to the ellipsoid. This means that many
smartphone applications or handheld GPS devices internally transform the ellipsoidal to the
geoidal height with the help of a geoid model.
To judge the relative accuracy, the RMSE of the elevation differences between two adjacent GPS
track points has been calculated. Overall, the RMSE is 0.3 m; however, depending on the land use
class, the RMSE ranges from 0.2 m to 0.5 m. Land use classes which are characterized mainly by
fields and almost no buildings, like ‘farmland’ or ‘allotments’, perform better (RMSE = 0.2 m)
than others which are characterized by tall buildings or a dense tree canopy, such as residential
areas or forests (RMSE = 0.3 m / 0.5 m). Combining the RMSE with the average distance
between the points results in an incline error of 2.4 % overall, 1.0 % for farmland and 4.3 % for
forested areas. With an incline accuracy of 2.4 % it is possible to derive incline from GPS traces
with a reasonable accuracy, especially considering that streets are often covered by multiple traces.
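How an RMSE of elevation differences translates into an incline error can be sketched as follows.
The sketch assumes that the two quantities are combined as a simple ratio (RMSE divided by the
mean horizontal point spacing, expressed in percent); this is an interpretation, not a formula
quoted from the evaluation. The function names and the spacing value in the example are
hypothetical.

```python
from math import sqrt


def rmse(residuals):
    """Root mean square error of a list of residuals (GPS minus reference)."""
    if not residuals:
        raise ValueError("at least one residual is required")
    return sqrt(sum(r * r for r in residuals) / len(residuals))


def incline_error_percent(rmse_dh_m, mean_spacing_m):
    """Incline error in percent, assuming error = RMSE / mean point spacing."""
    if mean_spacing_m <= 0:
        raise ValueError("mean spacing must be positive")
    return 100.0 * rmse_dh_m / mean_spacing_m


if __name__ == "__main__":
    # Residuals of elevation differences between adjacent track points, in metres
    residuals = [0.2, -0.2, 0.2, -0.2]
    # 20 m is a hypothetical spacing chosen only for illustration
    print(incline_error_percent(rmse(residuals), 20.0))
```

Under this assumption, the same elevation noise produces a larger incline error the closer the
track points lie together.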
Besides the absolute and relative accuracy, the coverage of GPS traces and the density of GPS track
points have been evaluated. The coverage was investigated by street type. Considering all
street types which are used by cars, streets of higher priority in the pilot region also have a higher
coverage. With at least one GPS trace, 100 % of the motorways, primary and
secondary streets are covered, while only 60 % of the residential streets are covered. Considering the
coverage by at least 5 GPS traces, still almost 100 % of the motorways and primary streets, 82 % of
the secondary and only 14 % of the residential streets are covered. The street types ‘path’ and
‘footway’, which are used by pedestrians and also mobility-restricted people, are covered
comparably to residential streets. Cycle ways even have a better coverage, which is comparable to
that of secondary streets. This shows that the contributors of OSM are not only traveling by car;
many paths which are dedicated to pedestrians are covered as well. This is of benefit, considering
the motivation of this thesis of calculating the incline for mobility-restricted people.
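A coverage statistic of this kind can be sketched in a few lines of Python. The sketch assumes
that coverage is measured as the share of street length (rather than the share of segment count)
and that the number of matched traces per segment is already known. The tuple layout and the
function name are hypothetical.

```python
from collections import defaultdict


def coverage_by_type(segments, min_traces=1):
    """Percentage of street length per street type covered by >= min_traces traces.

    segments: iterable of (street_type, length_m, n_traces) tuples.
    """
    total = defaultdict(float)
    covered = defaultdict(float)
    for street_type, length_m, n_traces in segments:
        total[street_type] += length_m
        if n_traces >= min_traces:
            covered[street_type] += length_m
    return {t: 100.0 * covered[t] / total[t] for t in total if total[t] > 0}


if __name__ == "__main__":
    segments = [
        ("residential", 100.0, 0),
        ("residential", 300.0, 6),
        ("primary", 200.0, 5),
    ]
    print(coverage_by_type(segments, min_traces=1))
    print(coverage_by_type(segments, min_traces=6))
```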
Overall, the average distance between two adjacent points of the GPS traces is 14 m. Compared to
the horizontal resolution of the SRTM-1 DEM, this is twice as high, and theoretically 2 GPS track
points fall into one pixel of the DEM. This results in a higher horizontal resolution, especially if a
street is covered by multiple traces. The distance between two GPS track points depends on the
average speed on the street type. For example, the average distance on motorways is 36 m, while on
footways the average distance is only 9 m. This implies that most devices record the GPS track
points in a time-dependent interval.
The third aim of this thesis is to validate the result, using the incline derived from the
high-accuracy DTM as reference and the SRTM-1 DSM for comparison. The incline was calculated for
3064 km of street length, which is equivalent to 57 % of the entire street network within the pilot
region. Out of this street length, 61 % have an incline error smaller than 1 %, which is probably a
sufficient accuracy for most use-cases. For as much as 85 % (2607 km) of this street length, the
incline could be calculated with an error below 3 %, which may still be reasonable for some
use-cases. The normally distributed incline error has a standard deviation of σ = 2.3 % (with 95 % of
the data), but depending on the land use class σ ranges from σ = 1.1 % (farmland) to σ = 3.6 %
(forest). It is noticeable that the incline is more accurate within land use classes which do not
suffer from obstructions of the satellite signal. Differentiating the incline accuracy by terrain
class showed that the incline can be determined with higher accuracy in flat areas (σ = 1.4 %),
whereas in mountainous areas the accuracy is worse with σ = 5.7 %. However, the main part of the
mountainous area is characterized by forests, which also perform worse than other land use classes.
The accuracy can generally be improved by considering only street segments which are covered
by multiple traces. For example, if all street segments with a GPS incline are considered, 77 % of
the inclines are determined with an accuracy better than 2 %. With an increasing minimum number
of traces, the percentage of streets with an incline error below 2 % increases to 87 % (>5 traces)
and 92 % (>10 traces), respectively. However, the coverage gets significantly worse.
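The trade-off between accuracy and coverage can be reproduced for any data set with a short
Python sketch. It assumes that each street segment carries its length, its GPS incline, a reference
incline and the number of contributing traces, and that shares are weighted by street length. The
tuple layout and the function name are hypothetical.

```python
def share_below(segments, max_error=2.0, min_traces=1):
    """Percentage of street length whose absolute incline error is below max_error.

    segments: iterable of (length_m, gps_incline, ref_incline, n_traces) tuples,
    inclines in percent. Only segments with at least min_traces traces are used.
    Returns None if no segment satisfies the trace filter.
    """
    selected = [
        (length, abs(gps - ref))
        for length, gps, ref, n_traces in segments
        if n_traces >= min_traces
    ]
    total = sum(length for length, _ in selected)
    if total == 0:
        return None
    good = sum(length for length, err in selected if err < max_error)
    return 100.0 * good / total


if __name__ == "__main__":
    segments = [
        (100.0, 3.0, 2.5, 1),  # error 0.5 %
        (100.0, 5.0, 1.0, 1),  # error 4.0 %
        (200.0, 1.0, 1.2, 6),  # error 0.2 %
    ]
    print(share_below(segments, min_traces=1))
    print(share_below(segments, min_traces=5))
```

Raising `min_traces` shrinks the set of evaluated segments, which is why a higher share of
accurate inclines goes along with a lower coverage.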
The GPS incline was compared to the incline derived from the SRTM-1 DSM to see how
user-generated GPS traces perform in comparison to other freely available data. The evaluation has
shown that the GPS incline performs slightly better than the SRTM incline. Using SRTM-1, the
incline could be determined with an error smaller than 2 % for 73 % of the street network. With
GPS traces this increases to 77 %, considering all street segments with at least 1 GPS trace, and to
86 % for the streets with at least 5 GPS traces. However, the coverage of GPS cannot keep up with
the SRTM-1 DEM, which is almost globally available. The percentage of streets with a GPS
incline error smaller than 2 % is only 44 % (at least 1 GPS trace) and 18 % (at least 5 GPS traces),
respectively.
In conclusion, it is possible to derive incline values of a street network with a
reasonable accuracy if the streets are covered with multiple GPS traces. Especially in comparison
to the SRTM-1 DSM, the GPS incline performs better, although the coverage is significantly lower.
By introducing other sources of user-generated GPS traces, the coverage can be improved. The
result shows that it is nowadays possible to achieve comparable or even slightly better results
with user-generated data than with data collected by a research satellite. Admittedly,
user-generated GPS traces also require satellites, but the data was not primarily collected for the
purpose of incline calculation.
6.2 Outlook
This approach has been tested in the area around Heidelberg. The advantage of this area is that
there is a diversity of land use classes as well as flat and mountainous areas. However, the
mountainous area is mainly covered by forests and the residential areas are often in flat terrain. To
further test how the accuracy of the GPS incline depends on mountainous and flat terrain,
the approach could be applied to other pilot regions, for example Zürich in Switzerland,
where a high-accuracy DTM is available as open data.
Furthermore, future work could try to improve the results of this approach. One way to
achieve better results is the introduction of other sources on top of OpenStreetMap. The data of
sport tracking platforms and projects such as Strava or gpsies.com could be combined with the
GPS traces from OSM. Unfortunately, some projects do not offer public access to the GPS traces,
but by requesting the data for a specific reason or setting up a cooperation, it might be possible to
get anonymized data. The larger amount of data would result in a higher coverage of GPS traces,
which leads to a higher completeness of streets with GPS incline and to higher accuracies, since
there will be more streets with multiple corresponding traces. If the area of interest is limited and
relatively small, it is also possible to have volunteers collect data specifically for this purpose.
Furthermore, a smartphone can be given to people who regularly drive or walk through the area
(e.g. couriers, taxi drivers) to record their location throughout the entire work day.
An additional field of further research would be to compute a digital terrain model from
user-generated GPS traces. Massad & Dalyot (2015) have already investigated the generation
of a DTM using GPS traces recorded with a smartphone. They tested their approach, which
includes 2D Kalman filtering, on a university campus with data collected exclusively for this
purpose and under good conditions. The measured GPS track points are relatively evenly
distributed, since not only points on paths have been measured, but also on lawns and in parking
spaces. The approach of Massad & Dalyot (2015) could be tested using the GPS data of OSM. It
offers a large amount of data; however, it also involves some challenges. In the case of OSM
GPS traces, it is not known which devices were used or to which vertical datum the elevation
refers, and the traces are mainly recorded on streets or paths. The latter would lead to a gap of
data in the areas between the streets. Furthermore, streets often have multiple traces, which leads
to a high density of points which may differ a lot in elevation due to their poor absolute accuracy.
Because of the aforementioned challenges, it would be interesting to find out whether the OSM GPS
traces are suitable for deriving a digital terrain model.
7 Bibliography
Bachofer, F. (2011): Einfluss der vertikalen Genauigkeit von DGM aus das EcoRouting von
Elektrofahrzeugen. In J. Strobl, T. Blaschke, G. Griesebner (Eds.): Angewandte Geoinformatik
2011. Beiträge zum 23. AGIT-Symposium Salzburg. Berlin, Offenbach: Wichmann,
pp. 338–346.
Bauer, C. A. (2010): User Generated Content – Urheberrechtliche Zulässigkeit nutzergenerierter
Medieninhalte. In H. Große Ruse-Khan, N. Klass, S. von Lewinski (Eds.): Nutzergenerierte
Inhalte als Gegenstand des Privatrechts, vol. 15. Berlin, Heidelberg: Springer, pp. 1–42.
Boucher, C. (2013): Fusion of GPS, OSM and DEM Data for Estimating Road Network Elevation.
In : Fifth International Conference on Computational Intelligence, Communication Systems
and Networks (CICSyN). Madrid, Spain, pp. 273–278.
Cartwright, W.; Gartner, G.; Meng, L.; Peterson, M. P.; Peckham, R. J.; Jordan, G. (2007): Digital
Terrain Modelling. Berlin, Heidelberg: Springer.
Conley, R.; Cosentino, R.; Hegarty, C. J.; Kaplan, E. D.; Leva, J. L.; Uijt de Haag, M.; Van Dyke,
K. (2006): Performance on Stand-Alone GPS. In E. D. Kaplan, C. Hegarty (Eds.): Understanding
GPS. Principles and applications. 2nd edition. Boston: Artech House, pp. 301–378.
Cosentino, R. J.; Diggle, D. W.; Uijt de Haag, M.; Hegarty, C. J.; Milbert, D.; Nagle, J. (2006):
Differential GPS. In E. D. Kaplan, C. Hegarty (Eds.): Understanding GPS. Principles and
applications. 2nd edition. Boston: Artech House, pp. 379–458.
Czegka, W.; Braune, S.; Behrends, K. (2004): Die Qualität der SRTM-90m Höhendaten und ihre
Verwendbarkeit in GIS. 24. Wissenschaftlich-Technische Tagung der DGPF. Halle, 2004.
Ding, D.; Parmanto, B.; Karimi, H. A.; Roongpiboonsopit, D.; Pramana, G.; Conahan, T.;
Kasemsuppakorn, P. (2007): Design considerations for a personalized wheelchair navigation
system. In Conference proceedings: Annual International Conference of the IEEE Engineering
in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society.
Annual Conference 2007, pp. 4790–4793.
empirica Gesellschaft für Kommunikations- und Technologieforschung mbH (2015): Welcome to
cap4access. Available online at http://cap4access.eu/intro/, checked on 1/15/2015.
European Space Agency (2015): What is Galileo? Available online at
http://www.esa.int/Our_Activities/Navigation/The_future_-_Galileo/What_is_Galileo,
checked on 4/23/2015.
Farr, T. G.; Rosen, P. A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S. et al. (2007): The Shuttle
Radar Topography Mission. In Reviews of Geophysics 45 (2). DOI:
10.1029/2005RG000183.
Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P. (1996): From Data Mining to Knowledge Discovery
in Databases. In U. M. Fayyad (Ed.): Advances in knowledge discovery and data mining.
Menlo Park: AAAI Press, pp. 1–34.
Feairheller, S.; Clark, R. (2006): Other Satellite Navigation Systems. In E. D. Kaplan, C. Hegarty
(Eds.): Understanding GPS. Principles and applications. 2nd edition. Boston: Artech House,
pp. 595–634.
Franke, D.; Dzafic, D.; Baumeister, D.; Kowalewski, S. (2012): Energieeffizientes Routing für
Elektrorollstühle. In : 13. Aachener Kolloquium Mobilität und Stadt (AMUS/ACMOTE):
RWTH Aachen, pp. 65–68. Available online at
http://publications.embedded.rwth-aachen.de/file/51, checked on 7/21/2015.
Goodchild, M. F. (2007): Citizens as sensors: the world of volunteered geography. In GeoJournal
69 (4), pp. 211–221. DOI: 10.1007/s10708-007-9111-y.
Hahmann, S. (2014): Zur Beziehung von Raum und Inhalt nutzergenerierter geographischer
Informationen. Dissertation. Technische Universität Dresden, Dresden. Institut für
Kartographie.
Haining, R. P. (2003): Spatial data analysis. Theory and practice. Cambridge, UK, New York:
Cambridge University Press.
Haklay, M.; Weber, P. (2008): OpenStreetMap. User-Generated Street Maps. In IEEE Pervasive
Computing 7 (4), pp. 12–18. DOI: 10.1109/MPRV.2008.80.
Han, J.; Kamber, M. (2006): Data mining. Concepts and techniques. 2nd ed. Amsterdam, Boston,
San Francisco, CA: Elsevier; Morgan Kaufmann (The Morgan Kaufmann series in data
management systems).
Han, S.; Rizos, C. (1999): Road Slope Information from GPS-Derived Trajectory Data. In Journal
of Surveying Engineering 125 (2), pp. 59–68.
Harriehausen-Mühlbauer, B. (2014): Mobile Navigation for Limited Mobility Users. In D.
Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, A. Kobsa, F. Mattern et al. (Eds.): Digital
Human Modeling. Applications in Health, Safety, Ergonomics and Risk Management, vol.
8529. Cham: Springer International Publishing (Lecture Notes in Computer Science),
pp. 535–545.
Heipke, C. (2010): Crowdsourcing geospatial data. In ISPRS Journal of Photogrammetry and
Remote Sensing 65 (6), pp. 550–557. DOI: 10.1016/j.isprsjprs.2010.06.005.
Hofmann-Wellenhof, B.; Lichtenegger, H.; Wasle, E. (2008): GNSS - Global Navigation Satellite
Systems. GPS, GLONASS, Galileo, and more. Wien, New York: Springer.
Jokar Arsanjani, J. (2014): Case study I: VGI platforms and data generalization. In D. Burghardt,
C. Duchêne, W. Mackaness (Eds.): Abstracting Geographic Information in a Data Rich
World. Cham: Springer International Publishing (Lecture Notes in Geoinformation and
Cartography), pp. 131–138.
Karussel (2014): Digitalizing GPX Points or How to Track Vehicles With GraphHopper. Available
online at
https://karussell.wordpress.com/2014/07/28/digitalizing-gpx-points-or-how-to-track-vehicles-with-graphhopper/,
updated on 7/28/2014, checked on 5/8/2015.
Kono, T.; Fushiki, T.; Asada, K.; Nakano, K. (2008): Fuel Consumption Analysis and Prediction
Model for “Eco” Route Search. In : 15th World Congress on Intelligent Transport Systems
and ITS America's 2008 Annual Meeting.
Kurihara, M.; Nonaka, H.; Yoshikawa, T. (2004): Use of highly accurate GPS in network-based
barrier free street map creation system. In : IEEE International Conference on Systems, Man
and Cybernetics. The Hague, Netherlands, Oct. 10-13, 2004, pp. 1169–1173.
Langley, R. B. (1999): Dilution of Precision. In GPS World (10 (5)), pp. 52–59.
Liu, G.; Hossain, K. M. A.; Iwai, M.; Ito, M.; Tobe, Y.; Sezaki, K.; Matekenya, D. (2014): Beyond
horizontal location context: Measuring Elevation Using Smartphone’s Barometer. In :
Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous
Computing: Adjunct Publication. New York, USA, pp. 459–468.
Marchal, F.; Hackney, J.; Axhausen, K. (2005): Efficient Map Matching of Large Global
Positioning System Data Sets: Tests on Speed-Monitoring Experiment in Zürich. In Transportation
Research Record 1935 (1), pp. 93–100. DOI: 10.3141/1935-11.
Massad, I.; Dalyot, S. (2015): Towards the production of digital terrain models from volunteered
GPS trajectories. In Survey Review. DOI: 10.1179/1752270615Y.0000000010.
Menkens, C.; Sussmann, J.; Al-Ali, M.; Breitsameter, E.; Frtunik, J.; Nendel, T.; Schneiderbauer,
T. (2011): EasyWheel - A Mobile Social Navigation and Support System for Wheelchair
Users. In : Eighth International Conference on Information Technology: New Generations
(ITNG). Las Vegas, NV, USA, pp. 859–866.
Mennis, J.; Guo, D. (2009): Spatial data mining and geographic knowledge discovery—An
introduction. In Computers, Environment and Urban Systems 33 (6), pp. 403–408. DOI:
10.1016/j.compenvurbsys.2009.11.001.
Meyer, D. J. (2011): ASTER Global Digital Elevation Model Version 2 – Summary of Validation
Results. Available online at
https://www.jspacesystems.or.jp/ersdac/GDEM/ver2Validation/Summary_GDEM2_validation_report_final.pdf,
checked on 1/15/2015.
Müller, A.; Neis, P.; Auer, M.; Zipf, A. (2010): Ein Routenplaner für Rollstuhlfahrer auf der Basis
von OpenStreetMap-Daten - Konzeption, Realisierung und Perspektiven. In J. Strobl, T.
Blaschke, G. Griesebner (Eds.): Angewandte Geoinformatik 2010. Beiträge zum 22. AGIT-
Symposium Salzburg. Berlin, Offenbach: Wichmann.
Neis, P.; Zielstra, D. (2014a): Generation of a tailored routing network for disabled people based
on collaboratively collected geodata. In Applied Geography 47, pp. 70–77. DOI:
10.1016/j.apgeog.2013.12.004.
Neis, P.; Zielstra, D. (2014b): Recent Developments and Future Trends in Volunteered Geographic
Information Research. The Case of OpenStreetMap. In Future Internet 6 (1), pp. 76–106.
DOI: 10.3390/fi6010076.
Neis, P.; Zielstra, D.; Zipf, A. (2012): The Street Network Evolution of Crowdsourced Maps.
OpenStreetMap in Germany 2007–2011. In Future Internet 4 (4), pp. 1–21. DOI:
10.3390/fi4010001.
Open Knowledge Foundation (2015): ODC Open Database License (ODbL) Summary. Available
online at http://opendatacommons.org/licenses/odbl/summary/, checked on 4/20/2015.
OpenStreetMap Foundation Wiki (2015a): About. Available online at
http://wiki.osmfoundation.org/w/index.php?title=About&oldid=3201, updated on 4/1/2015,
checked on 4/20/2015.
OpenStreetMap Foundation Wiki (2015b): License/We Are Changing The License. Available
online at
http://wiki.osmfoundation.org/w/index.php?title=License/We_Are_Changing_The_License&oldid=1813,
updated on 4/1/2015, checked on 4/20/2015.
OpenStreetMap Foundation Wiki (2015c): Working Groups. Available online at
http://wiki.osmfoundation.org/w/index.php?title=Working_Groups&oldid=2220, updated on
4/1/2015, checked on 4/20/2015.
OpenStreetMap Wiki (2015a): Bing. Available online at
http://wiki.openstreetmap.org/w/index.php?title=Bing&oldid=1117458, updated on
4/16/2015, checked on 4/18/2015.
OpenStreetMap Wiki (2015b): Map Features. Available online at
http://wiki.openstreetmap.org/w/index.php?title=Map_Features&oldid=1178564, checked on
5/27/2015.
OpenStreetMap Wiki (2015c): Stats. Available online at
http://wiki.openstreetmap.org/w/index.php?title=Stats&oldid=1145799, updated on
4/9/2015, checked on 4/15/2015.
OpenStreetMap Wiki (2015d): User:Ikonor/DE:SRTM Alternativen / DGM – OpenStreetMap
Wiki. Edited by OpenStreetMap Wiki. Available online at
http://wiki.openstreetmap.org/w/index.php?title=User:Ikonor/DE:SRTM_Alternativen_/_DGM&oldid=1160583,
checked on 7/11/2015.
OpenStreetMap Wiki (2015e): Key:incline. Available online at
http://wiki.openstreetmap.org/w/index.php?title=Key:incline&oldid=1148320, checked on
5/27/2015.
Quddus, M. A.; Ochieng, W. Y.; Noland, R. B. (2007): Current map-matching algorithms for
transport applications: State-of-the art and future research directions. In Transportation
Research Part C: Emerging Technologies 15 (5), pp. 312–328. DOI: 10.1016/j.trc.2007.05.002.
Ramm, F.; Topf, J. (2010): OpenStreetMap. Die freie Weltkarte nutzen und mitgestalten.
3. Auflage. Berlin: Lehmanns Media.
Ramm, F.; Topf, J.; Chilton, S. (2011): OpenstreetMap. Using and enhancing the free map of the
world. English ed. Cambridge, England: UIT Cambridge.
Resch, B. (2013): People as Sensors and Collective Sensing - Contextual Observations
Complementing Geo-Sensor Network Measurements. In J. M. Krisp (Ed.): Progress in
location-based services. Heidelberg, New York: Springer (Lecture Notes in Geoinformation and
Cartography), pp. 391–406.
Sachenbacher, M.; Leucker, M.; Artmeier, A.; Haselmayr, J. (2011): Efficient Energy-Optimal
Routing for Electric Vehicles. In : Proceedings of the Twenty-Fifth AAAI Conference on
Artificial Intelligence and the Twenty-Third Innovative Applications of Artificial
Intelligence Conference, 7-11 August 2011, San Francisco, California, USA. Menlo Park, Calif.:
AAAI Press, pp. 1402–1407.
Santerre, R.; Pan, L.; Cai, C.; Zhu, J. (2014): Single Point Positioning Using GPS, GLONASS and
BeiDou Satellites. In Positioning 05 (04), pp. 107–114. DOI: 10.4236/pos.2014.54013.
Sester, M.; Jokar Arsanjani, J.; Klammer, R.; Burghardt, D.; Haunert, J.-H. (2014): Integrating and
Generalising Volunteered Geographic Information. In D. Burghardt, C. Duchêne, W.
Mackaness (Eds.): Abstracting Geographic Information in a Data Rich World. Cham:
Springer International Publishing (Lecture Notes in Geoinformation and Cartography),
pp. 119–155.
Shekhar, S.; Zhang, P.; Huang, Y.; Vatsavai, R. R. (2004): Trends in Spatial Data Mining. In H.
Kargupta (Ed.): Data mining. Next generation challenges and future directions. Menlo Park,
Calif., London, Cambridge, Mass.: AAAI Press; Copublished and distributed by MIT Press.
Sui, D. Z. (2008): The wikification of GIS and its consequences: Or Angelina Jolie’s new tattoo
and the future of GIS. In Computers, Environment and Urban Systems 32 (1), pp. 1–5. DOI:
10.1016/j.compenvurbsys.2007.12.001.
Torge, W. (2001): Geodesy. 3rd completely rev. and extended ed. Berlin, New York: W. de
Gruyter.
Tukey, J. W. (1977): Exploratory data analysis. Reading, Mass.: Addison-Wesley Pub. Co
(Addison-Wesley series in behavioral science).
van Winden, K. (2014): Automatically Deriving and Updating Attribute Road Data from
Movement Trajectories. Master's Thesis. Delft University of Technology.
Völkel, T.; Weber, G. (2008): RouteCheckr. In S. Harper, A. Barreto (Eds.): the 10th international
ACM SIGACCESS conference. Halifax, Nova Scotia, Canada, p. 185.
Zhang, L.; Thiemann, F.; Sester, M. (2010): Integration of GPS traces with road map. In :
Computational Transportation Science, pp. 17–22.
Zhilin, L.; Qing, Z.; Gold, C. (2005): Digital terrain modeling Principles and methodology. New
York, USA: CRC-Press.