WhereNext

Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009. Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa), www-kdd.isti.cnr.it

Wireless network infrastructures are the nerves of our territory

Besides offering their services, they gather highly informative traces about human mobile activities

Miniaturization, wearability, pervasiveness will produce traces of increasing

positioning accuracy

semantic richness

From the analysis of the traces of our mobile phones it is possible to reconstruct our mobile behaviour, the way we collectively move

This knowledge may help us improve decision-making in many mobility-related issues:

Planning traffic and public mobility systems in metropolitan areas;

Planning physical communication networks

Forecasting traffic-related phenomena

Organizing logistics systems

Prediction

Predicting the next location of a trajectory can improve a large set of services such as:

Navigational services.

Traffic management.

Location-based advertising.

Services Pre-fetching.

Simulation.

How to realize this idea:

Extract patterns from all the available movements in a certain area, rather than from the individual history of an object;

Use these local movement patterns as predictive rules.

Build a prediction tree as global model.

The local pattern we use is the T-Pattern. It describes the common behavior of a group of users in space and time.

Generating all rules from each T-Pattern and using them to build a classifier is too expensive.

To avoid rule generation, the T-Pattern set is organized as a prefix tree.

For each node v:

Id identifies the node v

Region is a spatial component of the T-Pattern

Support is the support of the T-Pattern

For each edge j:

[a,b] corresponds to the time interval of the T-Pattern
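
A minimal sketch of such a prefix tree, assuming each T-Pattern is given as a list of region names, a list of [a,b] transition intervals between consecutive regions, and a support value. The names `Node` and `insert_pattern`, the input layout, and the rule that a shared prefix keeps the highest support seen are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]           # [a, b] time interval labelling an edge
EdgeKey = Tuple[str, Optional[Interval]]

_ids = count(1)  # node id generator; 0 is reserved for the root


@dataclass
class Node:
    id: int
    region: str      # spatial component of the T-Pattern
    support: float   # support of the T-Pattern ending at this node
    # children keyed by (region, interval); the interval belongs to the edge
    children: Dict[EdgeKey, "Node"] = field(default_factory=dict)


def insert_pattern(root: Node, regions: List[str],
                   intervals: List[Interval], support: float) -> None:
    """Insert one T-Pattern; patterns sharing a prefix reuse the same nodes."""
    assert len(intervals) == len(regions) - 1
    # the first region hangs off the root through an edge with no interval
    edge_labels: List[Optional[Interval]] = [None, *intervals]
    node = root
    for region, interval in zip(regions, edge_labels):
        key = (region, interval)
        child = node.children.get(key)
        if child is None:
            child = Node(next(_ids), region, support)
            node.children[key] = child
        else:
            # assumption: a prefix is at least as frequent as its extensions
            child.support = max(child.support, support)
        node = child


root = Node(0, "root", 0.0)
insert_pattern(root, ["A", "B", "C"], [(5, 10), (2, 4)], 0.2)
insert_pattern(root, ["A", "B", "D"], [(5, 10), (7, 9)], 0.3)
```

Both example patterns share the prefix A, B, so the tree holds one A node, one B node, and two leaves (C and D) under B.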

Three steps:

Search for best match

Candidate generation

Make predictions

The spatio-temporal distance is computed between the trajectory segment (bounded in time using the previous transition time) and the current node of the path.

The path score is the aggregation of all punctual scores along a path.

The Best Match is the path having:

the maximum path score;

at least one admissible prediction.
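
The selection rule can be sketched as follows, assuming every candidate path already carries its path score and its list of admissible predictions. `Candidate` and `best_match` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    path: List[str]          # regions matched along the tree path (illustrative)
    score: float             # path score
    predictions: List[str]   # admissible next regions; may be empty


def best_match(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring path with at least one admissible prediction."""
    admissible = [c for c in candidates if c.predictions]
    if not admissible:
        return None  # no path can produce a prediction
    return max(admissible, key=lambda c: c.score)


candidates = [
    Candidate(["A", "B"], 0.9, []),          # top score, but nothing to predict
    Candidate(["A", "C"], 0.7, ["D"]),
    Candidate(["B"], 0.4, ["C", "E"]),
]
best = best_match(candidates)
```

In this example the path A, B is discarded despite its score because it offers no admissible prediction, so A, C is selected.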

Average generalizes the distances between the trajectory and each node

Sum is based on the concept of depth

Max is the optimistic one: the best punctual score is selected as the path score

Context-dependent aggregations can take into consideration other aspects of the problem.
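
A minimal sketch of the three standard aggregations, assuming the punctual scores along a path are available as a list of numbers. The dictionary keys and the function name `path_score` are illustrative assumptions.

```python
from statistics import mean
from typing import Callable, Dict, List

# one aggregation function per option; the keys are illustrative
AGGREGATIONS: Dict[str, Callable[[List[float]], float]] = {
    "avg": mean,  # average of the punctual scores
    "sum": sum,   # grows with the number of matched nodes
    "max": max,   # optimistic: keep only the best punctual score
}


def path_score(punctual_scores: List[float], agg: str = "avg") -> float:
    """Aggregate the punctual scores of one path into a single path score."""
    if not punctual_scores:
        raise ValueError("a path needs at least one punctual score")
    return AGGREGATIONS[agg](punctual_scores)


scores = [0.5, 1.0, 0.75]
avg_score = path_score(scores, "avg")   # 0.75
sum_score = path_score(scores, "sum")   # 2.25
max_score = path_score(scores, "max")   # 1.0
```

Sum increases as more nodes are matched, which is one reading of "based on the concept of depth"; a context-dependent aggregation would be one more entry in the dictionary.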

The WhereNext algorithm can be tuned using its parameters:

-th_t : time window tolerance

-th_s : space window tolerance

-th_score : minimum prediction score threshold

-th_agg : the aggregation function used to compute the path score (Avg, Sum or Max)
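
As a sketch, the four parameters could be grouped in one validated configuration object. The class name `WhereNextParams`, the lowercase aggregation names, and the example values (with no particular units) are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhereNextParams:
    th_t: float       # time window tolerance
    th_s: float       # space window tolerance
    th_score: float   # minimum prediction score threshold
    th_agg: str       # aggregation function: "avg", "sum" or "max"

    def __post_init__(self) -> None:
        if self.th_agg not in ("avg", "sum", "max"):
            raise ValueError(f"unknown aggregation: {self.th_agg!r}")
        if self.th_t < 0 or self.th_s < 0:
            raise ValueError("tolerances must be non-negative")


# illustrative values only
params = WhereNextParams(th_t=60.0, th_s=100.0, th_score=0.5, th_agg="avg")
```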

It is very hard to know which set of T-patterns is best for building our model:

a big set of T-patterns leads to very slow prediction.

a small set of T-patterns leaves gaps in coverage.

For this reason we have defined a way to measure the prediction power of a T-Pattern set.

An evaluation function is defined to estimate the predicting power of a T-Pattern set.

SpatialCoverage : the space coverage of the regions contained in the T-Patterns set;

DatasetCoverage : measures how much the T-Pattern set represents the trajectories

RegionSeparation : the precision of the regions in the T-Pattern set.

The results are evaluated using the following measures:

Accuracy : the number of correctly predicted locations (in space and time) divided by the total number of trajectories to be predicted.

Average Error : the average distance between the real trajectories in the predicted interval and the region predicted.

Prediction rate : the number of trajectories which have a prediction divided by the total number of trajectories to be predicted.
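
Accuracy and prediction rate can be sketched as below, assuming each test trajectory is summarized by one outcome: `None` when no prediction was produced, `True` when the predicted location was correct, `False` otherwise. This encoding and the function name `evaluate` are assumptions for illustration; Average Error is left out because it needs the actual geometry of trajectories and regions.

```python
from typing import Dict, List, Optional


def evaluate(outcomes: List[Optional[bool]]) -> Dict[str, float]:
    """Compute accuracy and prediction rate over the trajectories to be predicted."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trajectories to evaluate")
    predicted = sum(o is not None for o in outcomes)  # a prediction exists
    correct = sum(o is True for o in outcomes)        # prediction was right
    return {
        "accuracy": correct / total,
        "prediction_rate": predicted / total,
    }


# four test trajectories: two correct, one wrong, one without prediction
result = evaluate([True, False, None, True])
```

Both measures share the same denominator, so accuracy can never exceed the prediction rate.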

We used a real-life GPS dataset obtained from 17,000 vehicles in the urban area of the city of Milan.

Predicted vs th_score

Accuracy vs Average Error

A visual example of the application on Milan mobility data. The context is traffic management and we want to predict how the traffic will move in the city center.

We have built a predictor on a good set of T-patterns which include the city gates of Milan.

- A new technique to predict the next locations of a trajectory based on previous movements of all the objects, without considering any information about the users.

- The time information is used not only to order the events but is intrinsically embedded in the T-Patterns used to build the prediction tree.

- The user can tune the method to obtain a good accuracy and prediction rate.

- We are experimenting with the method in real-world applications.

The same exact spatial location (x,y) usually never occurs twice

The same exact transition times usually do not occur twice

Solution: allow approximation

a notion of spatial neighborhood

a notion of temporal tolerance

Two points match if one falls within a spatial neighborhood N() of the other

Two transition times match if their temporal difference is within the temporal tolerance
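
A minimal sketch of approximate matching, under two assumptions that the slides do not state: the neighborhood N() is taken to be a square of half-side `eps` centred on the point, and two transition times match when they differ by at most `tau`. The names `points_match`, `times_match`, `eps` and `tau` are illustrative.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y)


def points_match(p: Point, q: Point, eps: float) -> bool:
    """True if q falls inside the square neighborhood of half-side eps around p."""
    return abs(p[0] - q[0]) <= eps and abs(p[1] - q[1]) <= eps


def times_match(t1: float, t2: float, tau: float) -> bool:
    """True if two transition times differ by at most the tolerance tau."""
    return abs(t1 - t2) <= tau


near = points_match((10.0, 20.0), (10.5, 19.5), eps=1.0)
far = points_match((10.0, 20.0), (13.0, 20.0), eps=1.0)
similar = times_match(30.0, 34.0, tau=5.0)
different = times_match(30.0, 40.0, tau=5.0)
```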

Example:

T-pattern mining can be mapped to a density estimation problem over R^(3n-1)

2 dimensions for each (x,y) in the pattern (2n)

1 dimension for each transition (n-1)

Density computed by

mapping each sub-sequence of n points of each input trajectory to R^(3n-1)

drawing an influence area for each point (composition of N() and the temporal tolerance)
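
The mapping step can be sketched as follows, assuming a trajectory is a time-ordered list of (x, y, t) points and that sub-sequences need not be contiguous. The ordering of coordinates in the output vector (all x, y pairs first, then the n-1 transition times) and the function names are assumptions made for illustration.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, t), ordered by time


def embed(points: Sequence[Point]) -> List[float]:
    """Map n timestamped points to 2n coordinates plus n-1 transition times."""
    coords = [c for x, y, _ in points for c in (x, y)]
    transitions = [b[2] - a[2] for a, b in zip(points, points[1:])]
    return coords + transitions


def embedded_subsequences(trajectory: Sequence[Point], n: int) -> Iterator[List[float]]:
    """Yield the (3n-1)-dimensional vector of every order-preserving sub-sequence."""
    for sub in combinations(trajectory, n):  # combinations keep the input order
        yield embed(sub)


trajectory = [(0.0, 0.0, 0.0), (1.0, 2.0, 5.0), (3.0, 1.0, 12.0)]
vectors = list(embedded_subsequences(trajectory, 2))
```

With n = 2 each vector has 3*2 - 1 = 5 components. The number of sub-sequences grows combinatorially with trajectory length, which illustrates why the direct approach is too expensive.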

Too computationally expensive, heuristics needed

Our solution: a combination of sequential pattern mining and density-based clustering
