where next
TRANSCRIPT
- 1. Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti. A. Monreale, F. Pinelli, R. Trasarti, F. Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009. Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa), www-kdd.isti.cnr.it
2.
- Wireless network infrastructures are the nerves of our territory
- Besides offering their services, they gather highly informative traces about human mobile activities
- Miniaturization, wearability and pervasiveness will produce traces of increasing:
  - positioning accuracy
  - semantic richness
3.
- From the analysis of the traces of our mobile phones it is possible to reconstruct our mobile behaviour, the way we collectively move
- This knowledge may help us improve decision-making in many mobility-related issues:
  - Planning traffic and public mobility systems in metropolitan areas
  - Planning physical communication networks
  - Forecasting traffic-related phenomena
  - Organizing logistics systems
  - Prediction
4. 5.
- Predicting the next location of a trajectory can improve a large set of services such as:
- Navigational services.
- Traffic management.
- Location-based advertising.
- Service pre-fetching.
- Simulation.
[Figure: three candidate next locations marked "?", labeled .4, .8, .35]
6.
- How to realize this idea:
- Extract patterns from all the available movements in a certain area instead of from the individual history of an object.
- Use these local movement patterns as predictive rules.
- Build a prediction tree as a global model.
[Figure labels: Trajectory dataset; Local patterns; Prediction Tree]
7.
[Workflow diagram labels: Select the set of interesting trajectories; Extract T-Patterns (a set of local models); Merge T-Patterns (global model); Use the condensed model as predictor; Validation; Evaluation]
8.
- The local pattern we use is the T-Pattern. It describes the common behavior of a group of users in space and time.
F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. KDD 2007: 330-339.
9.
- Generating all rules from each T-Pattern and using them to build a classifier is too expensive.
[Figure: T-Patterns 1, 2, 3, each expanded into rules R1, R2, R3, R4]
10.
- To avoid rule generation, the T-Pattern set is organized as a prefix tree.
- For each node v:
  - Id identifies the node v
  - Region is a spatial component of the T-Pattern
  - Support is the support of the T-Pattern
- For each edge j:
  - [a,b] corresponds to the time interval of the T-Pattern
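The node and edge description above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the names `Node`, `PredictionTree` and `insert` are hypothetical, regions are plain string labels, edges are merged only when their [a,b] intervals are identical, and a shared prefix node keeps the largest support seen. All of these are simplifying assumptions.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # [a, b] transition-time interval carried by an edge
EdgeKey = Tuple[str, Optional[Interval]]  # (child region, interval); None for root edges


@dataclass
class Node:
    id: int                  # identifies the node
    region: Optional[str]    # spatial component of the T-Pattern (None for the root)
    support: float = 0.0     # support of the T-Pattern ending at this node
    children: Dict[EdgeKey, "Node"] = field(default_factory=dict)


class PredictionTree:
    """Prefix tree over T-Patterns (hypothetical sketch)."""

    def __init__(self) -> None:
        self._ids = count()
        self.root = Node(next(self._ids), None)

    def insert(self, regions: List[str], intervals: List[Interval], support: float) -> Node:
        """Insert a T-Pattern with n regions and n-1 transition intervals."""
        if len(intervals) != len(regions) - 1:
            raise ValueError("a pattern with n regions needs n-1 intervals")
        node = self.root
        for i, region in enumerate(regions):
            key: EdgeKey = (region, intervals[i - 1] if i > 0 else None)
            child = node.children.get(key)
            if child is None:
                child = Node(next(self._ids), region)
                node.children[key] = child
            # Assumption: a shared prefix keeps the largest support seen so far.
            child.support = max(child.support, support)
            node = child
        return node


tree = PredictionTree()
tree.insert(["A", "B", "C"], [(5, 10), (8, 12)], support=0.2)
tree.insert(["A", "B", "D"], [(5, 10), (3, 6)], support=0.3)
```

The two patterns share the prefix A, B, so the tree stores that prefix once and branches only at the last region, which is what makes explicit rule generation unnecessary.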
11.
- Three steps:
  - Search for best match
  - Candidate generation
  - Make predictions
[Figure labels: How to compute the Best Match?; Best Match; Prediction]
12.
- The spatio-temporal distance is computed between the trajectory segment (bounded in time using the previous transition time) and the current node of the path.
Case a: the trajectory segment intersects the region of the node.
Case b: the enlarged trajectory segment intersects the region.
Case c: the enlarged trajectory segment does not intersect the region.
Where th_t is the time tolerance window defined by the user.
13.
- The path score is the aggregation of all punctual scores along a path.
- The Best Match is the path having:
  - the maximum path score;
  - at least one admissible prediction.
[Figure: example path with times 10 min, 15 min, 8 min, 10 min, 11 min, 16 min; punctual scores 1, .58, .8; path score .79]
14.
- Average: generalizes the distances between the trajectory and each node
- Sum: based on the concept of depth
- Max: the optimistic one; the best punctual score is selected as the path score
- Context-dependent aggregations can take into consideration other aspects of the problem.
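The aggregation choices and the Best Match rule can be illustrated with a short sketch. The function names `path_score` and `best_match` are hypothetical, and a path is represented only by its list of punctual scores plus a flag saying whether it has at least one admissible prediction; how that flag is determined is outside this sketch.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# The three aggregation functions named on the slide.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "sum": sum,
    "max": max,
}


def path_score(punctual_scores: Sequence[float], th_agg: str = "avg") -> float:
    """Aggregate the punctual scores along one path."""
    return AGGREGATIONS[th_agg](punctual_scores)


def best_match(
    paths: List[Tuple[Sequence[float], bool]], th_agg: str = "avg"
) -> Optional[Tuple[int, float]]:
    """Return (index, score) of the highest-scoring path that has an admissible prediction."""
    scored = [
        (path_score(scores, th_agg), i)
        for i, (scores, admissible) in enumerate(paths)
        if admissible
    ]
    if not scored:
        return None
    score, index = max(scored)
    return index, score


# Punctual scores from the slide example: 1, .58 and .8.
example = [1, 0.58, 0.8]
print(round(path_score(example, "avg"), 2))  # 0.79, the path score shown on the slide
```

With the average, the example reproduces the path score .79 from the figure; switching `th_agg` to `"max"` would give 1, which shows why Max is described as the optimistic choice.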
15.
- The WhereNext algorithm can be tuned using its parameters:
  - th_t: time window tolerance
  - th_s: space window tolerance
  - th_score: minimum prediction score threshold
  - th_agg: the aggregation function used to compute the path score (Avg, Sum or Max)
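One way to keep these four parameters together is a small configuration object. The class name `WhereNextParams`, the validation rules and the units used in the example (seconds, metres) are assumptions for illustration; the slides do not give default values, so none are supplied here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhereNextParams:
    th_t: float      # time window tolerance (unit assumed: seconds)
    th_s: float      # space window tolerance (unit assumed: metres)
    th_score: float  # minimum prediction score threshold
    th_agg: str      # aggregation for the path score: "avg", "sum" or "max"

    def __post_init__(self) -> None:
        if self.th_t < 0 or self.th_s < 0:
            raise ValueError("tolerances must be non-negative")
        if self.th_agg not in ("avg", "sum", "max"):
            raise ValueError("th_agg must be 'avg', 'sum' or 'max'")

    def accepts(self, score: float) -> bool:
        """A prediction is kept only if its score reaches th_score."""
        return score >= self.th_score


# Hypothetical values, chosen only to show usage.
params = WhereNextParams(th_t=120.0, th_s=250.0, th_score=0.7, th_agg="avg")
```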
16.
- It is very hard to know which set of T-Patterns is best for building our model:
- a big set of T-Patterns: very slow prediction.
- a small set of T-Patterns: coverage gaps.
- For this reason we have defined a way to measure the predictive power of a T-Pattern set.
17.
- An evaluation function is defined to estimate the predictive power of a T-Pattern set.
- SpatialCoverage: the spatial coverage of the regions contained in the T-Pattern set.
- DatasetCoverage: how well the T-Pattern set represents the trajectories.
- RegionSeparation: the precision of the regions in the T-Pattern set.
[Figure: Model 1 and Model 2, testing the a priori evaluation]
18.
You are here
19.
- The results are evaluated using the following measures:
- Accuracy: the number of correctly predicted locations (in space and time) divided by the total number of trajectories to be predicted.
- Average Error: the average distance between the real trajectories in the predicted interval and the predicted region.
- Prediction rate: the number of trajectories that have a prediction divided by the total number of trajectories to be predicted.
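The three measures can be computed from per-trajectory outcomes as in the sketch below. The representation is an assumption: each test trajectory yields either `None` (no prediction) or a pair `(is_correct, error)`, and the average error is taken over the trajectories that received a prediction, which the slides do not state explicitly.

```python
from typing import Dict, List, Optional, Tuple

# None means no prediction; otherwise (is_correct, distance to the predicted region).
Outcome = Optional[Tuple[bool, float]]


def evaluate(outcomes: List[Outcome]) -> Dict[str, Optional[float]]:
    """Compute accuracy, prediction rate and average error for a test set."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trajectories to evaluate")
    predicted = [o for o in outcomes if o is not None]
    correct = sum(1 for is_correct, _ in predicted if is_correct)
    errors = [error for _, error in predicted]
    return {
        "accuracy": correct / total,
        "prediction_rate": len(predicted) / total,
        # Assumption: averaged only over trajectories that have a prediction.
        "average_error": sum(errors) / len(errors) if errors else None,
    }


# Hypothetical outcomes for four test trajectories.
report = evaluate([(True, 0.0), (False, 120.0), None, (True, 30.0)])
print(report)  # accuracy 0.5, prediction_rate 0.75, average_error 50.0
```

Note that both accuracy and prediction rate share the same denominator, so accuracy can never exceed the prediction rate.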
[Figure: two examples labeled Original, Cut and Predicted Location; the second also shows the Error]
20.
- We used a real-life GPS dataset obtained from 17,000 vehicles in the urban area of Milan.
Training set: 4000 trajectories between 7 am and 10 am on Wednesday.
Test set: 500 trajectories between 7 am and 10 am on Thursday.
21.
- Predicted vs th_score
Average Error vs th_space
22.
- Accuracy vs Average Error
Single users: Accuracy and Prediction rate
23.
- A visual example of the application on Milan mobility data. The context is traffic management and we want to predict how the traffic will move in the city center.
- We have built a predictor on a good set of T-Patterns which include the city gates of Milan.
Part of the GeoPKDD integrated platform. F. Giannotti, D. Pedreschi, et al. GeoPKDD: Geographic privacy-aware knowledge discovery and delivery (European project), 2008.
24.
- A new technique to predict the next locations of a trajectory based on the previous movements of all the objects, without considering any information about the users.
- The time information is used not only to order the events; it is intrinsically embedded in the T-Patterns used to build the prediction tree.
- The user can tune the method to obtain good accuracy and prediction rate.
- We are experimenting with the method in real-world applications.
25. 26.
[Figure labels: Trajectories Dataset; Regions of Interest; T-PATTERNS]
27. 28.
- The exact same spatial location (x,y) almost never occurs twice
- The exact same transition times almost never occur twice
- Solution: allow approximation
  - a notion of spatial neighborhood
  - a notion of temporal tolerance
29.
- Two points match if one falls within a spatial neighborhood N() of the other
- Two transition times match if their temporal difference is within the temporal tolerance
- Example: (figure not preserved in the transcript)
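The two matching rules can be written as a pair of predicates. This is a sketch under stated assumptions: the slides do not say what shape N() has, so a circular neighborhood of a given `radius` is assumed here, and the parameter names `radius` and `tolerance` are hypothetical.

```python
from math import hypot
from typing import Tuple

Point = Tuple[float, float]


def points_match(p: Point, q: Point, radius: float) -> bool:
    """True if q falls inside the neighborhood of p (assumed: a circle of the given radius)."""
    return hypot(p[0] - q[0], p[1] - q[1]) <= radius


def transitions_match(t1: float, t2: float, tolerance: float) -> bool:
    """True if the two transition times differ by at most the temporal tolerance."""
    return abs(t1 - t2) <= tolerance


# Two trips whose stops are 5 units apart and whose travel times differ by 2 minutes.
print(points_match((0.0, 0.0), (3.0, 4.0), radius=5.0))  # True
print(transitions_match(10.0, 12.0, tolerance=2.0))      # True
```

Because the circular neighborhood is symmetric, "one falls within the neighborhood of the other" holds in both directions; a non-symmetric N() would need the check in each direction.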
30.
31.
32.
- T-Pattern mining can be mapped to a density estimation problem over R^(3n-1):
  - 2 dimensions for each (x,y) in the pattern (2n)
  - 1 dimension for each transition (n-1)
- Density is computed by:
  - mapping each sub-sequence of n points of each input trajectory to R^(3n-1)
  - drawing an influence area for each point (composition of N() and the temporal tolerance)
- Too computationally expensive: heuristics are needed
- Our solution: a combination of sequential pattern mining and density-based clustering
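The mapping to R^(3n-1) and its cost can be made concrete with a short sketch. The function names `embed` and `embed_all` are hypothetical, and sub-sequences are assumed to be order-preserving but not necessarily contiguous, which the slides do not specify.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, t)


def embed(points: Sequence[Sample]) -> List[float]:
    """Map n timestamped points to a vector with 2n coordinates and n-1 transition times."""
    coords = [c for x, y, _ in points for c in (x, y)]
    transitions = [t2 - t1 for (_, _, t1), (_, _, t2) in zip(points, points[1:])]
    return coords + transitions


def embed_all(trajectory: Sequence[Sample], n: int) -> Iterator[List[float]]:
    """Embed every order-preserving sub-sequence of n points (assumed non-contiguous)."""
    for sub in combinations(trajectory, n):
        yield embed(sub)


trajectory = [(0, 0, 0), (1, 2, 5), (4, 4, 12), (6, 5, 20)]
vectors = list(embed_all(trajectory, 3))
print(len(vectors), len(vectors[0]))  # 4 sub-sequences, each of dimension 3*3-1 = 8
```

Under this assumption a trajectory with m points produces C(m, n) vectors, so the number of points to place in the density space grows combinatorially, consistent with the remark that the direct approach is too expensive.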