microsoft research faculty summit 2007. john krumm microsoft research redmond, wa
TRANSCRIPT
Microsoft Research Faculty Summit 2007
John Krumm
Microsoft Research
Redmond, WA
55 GPS receivers
241 subjects
1.97 million points
106,000 miles (171,000 kilometers)
13,845 trips
Home addresses and demographic data
Greater Seattle
Seattle Downtown
Close-up
Garmin Geko 201
$115
10,000 point memory
Median recording interval: 6 seconds, 63 meters
Destination Modeling
Predestination – Destination prediction
Snap-to-Road – Map matching with temporal constraints
Personalized Routes
Location Privacy
Destinations of drivers in our location survey
John Krumm and Eric Horvitz, "Driver Destination Models", Eleventh International Conference on User Modeling (UM 2007), June 25-27, 2007, Corfu, Greece.
Destination Frequency Versus Ground Cover
U.S. Geological Survey – Seattle Area
[Figure: normalized frequency (0 to 0.4) of destinations by ground cover type: commercial, low intensity residential, evergreen forest, deciduous forest, shrubland, mixed forest, grasslands, water, pasture, quarry, transitional, high intensity residential, urban, fallow, bare rock, row crops, small grains, perennial ice, orchard, woody wetlands, emergent herbaceous …]
What are the most attractive kinds of ground cover?
Destinations vs. Time of Day
[Figure: mean destinations per week (0 to 3) versus hour of day (24 hour clock), with two series: previous destinations and new destinations.]
Probability of New Destination vs. Time of Day
[Figure: probability of new destination (0 to 1) versus hour of day (24 hour clock).]
Destinations vs. Day of Week
[Figure: mean destinations per week (0 to 6) versus day of week, with two series: previous destinations and new destinations.]
Probability of New Destination vs. Day of Week
[Figure: probability of new destination (0 to 1) versus day of week.]
Rate of Decline versus Demographics
Single versus partner – no significant difference
Children versus no children – no significant difference
Extended family nearby versus not – no significant difference
Gender – women decline faster than men
New Destinations
Drivers reach steady state after about two weeks
[Figure: new destinations visited (0 to 4) versus days into survey.]
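A minimal sketch of how a "new destinations visited" tally could be computed for one driver, assuming each day's destinations have already been reduced to discrete IDs (for example grid cells). The function name `new_destinations_per_day` and the input layout are illustrative assumptions, not taken from the slides.

```python
def new_destinations_per_day(daily_destinations):
    """Count, for each survey day, the destinations never seen on earlier days.

    daily_destinations: list of lists; entry k holds the destination IDs
    visited on day k (hypothetical input layout).
    """
    seen = set()
    counts = []
    for day in daily_destinations:
        new_today = set(day) - seen   # destinations not visited before today
        counts.append(len(new_today))
        seen |= new_today             # remember them for later days
    return counts


# Example: the count drops as the driver's usual places are exhausted.
print(new_destinations_per_day([[1, 2], [2, 3], [1, 3], [4]]))
```

A curve of these counts that flattens out over the survey is what a "steady state" would look like under this tally.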
Destination Modeling
Predestination – Destination prediction
Snap-to-Road – Map matching with temporal constraints
Personalized Routes
Location Privacy
John Krumm and Eric Horvitz, "Predestination: Inferring Destinations from Partial Trajectories", Eighth International Conference on Ubiquitous Computing (UbiComp 2006), September 2006.
Anticipatory information
Location-based advertising
Hybrid vehicle efficiency
Traffic Warning
Destination Safeco Field (54% chance): 15-minute delay at I-405 & I-90. Suggest I-5 instead.
Destination Seattle Center (31% chance): Broad St. closed. Suggest Denny Way instead.
Going to the airport? Park with us for $8/day!
Greater Seattle, ~40 km × 40 km
1 km grid
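As a rough illustration of the discretization, the sketch below maps a position to a cell of a 1 km grid laid over a 40 km by 40 km region. It assumes positions are given as offsets in meters east and north of the region's southwest corner and that cells are numbered row by row; the coordinate frame, the numbering, and the name `cell_index` are assumptions made for the example.

```python
GRID_SIZE_M = 40_000   # region is roughly 40 km on a side (from the slide)
CELL_SIZE_M = 1_000    # 1 km grid (from the slide)
CELLS_PER_SIDE = GRID_SIZE_M // CELL_SIZE_M  # 40 cells per side, 1600 in total


def cell_index(x_m: float, y_m: float) -> int:
    """Return the row-major index of the grid cell containing (x_m, y_m).

    x_m, y_m: offsets in meters east and north of the region's southwest
    corner (an assumed local coordinate frame).
    """
    if not (0 <= x_m < GRID_SIZE_M and 0 <= y_m < GRID_SIZE_M):
        raise ValueError("point lies outside the gridded region")
    col = int(x_m // CELL_SIZE_M)
    row = int(y_m // CELL_SIZE_M)
    return row * CELLS_PER_SIDE + col
```

With this numbering each candidate destination is a single integer from 0 to 1599.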
Ground Cover Prior
[Figure: Destination Frequency Versus Ground Cover, normalized frequency (0 to 0.4) by ground cover type. U.S. Geological Survey – Seattle Area.]
All Possible Destinations
Destinations of One Subject
[Figure: one panel per survey day, Day 1 through Day 14.]
Open-World Mixing Probabilities
[Figure: probability factor (0 to 1) versus day of survey (1 to 14), with three series:]
Wedding Cakes (α)
Background (β)
Closed-World (1-α-β)
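The three factors above sum to one, so they can be read as mixture weights over three per-cell distributions. The sketch below shows that mixing under the assumption that each component is a list of per-cell probabilities of equal length; the function and argument names are hypothetical.

```python
def open_world_prior(p_closed, wedding_cake, ground_cover, alpha, beta):
    """Mix three per-cell distributions with weights (1-alpha-beta), alpha, beta.

    p_closed:     closed-world probabilities (previously visited cells)
    wedding_cake: wedding cake term, weighted by alpha
    ground_cover: background term, weighted by beta
    All three are assumed to be equal-length lists that each sum to 1.
    """
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    if not (len(p_closed) == len(wedding_cake) == len(ground_cover)):
        raise ValueError("all component distributions must cover the same cells")
    w_closed = 1.0 - alpha - beta
    return [
        w_closed * c + alpha * w + beta * g
        for c, w, g in zip(p_closed, wedding_cake, ground_cover)
    ]
```

Because the weights sum to one and each component sums to one, the mixture is itself a probability distribution over cells.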
Personal destinations = visited cells + clustering + sparkling
[Figure: a trip from its start to the current location, with a candidate destination; labels R, r, Δt, t.]
rRe
Trip Time Distribution
[Figure: normalized frequency (0 to 0.25) versus trip time in minutes, in bins 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, > 39.]
From 2001 U.S. National Household Transportation Survey
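Final probability:
\[
P(D=i \mid E=e,\, T_S=t_S) = \frac{P_E(E=e \mid D=i)\, P_T(T_S=t_S \mid D=i)\, P_{open}(D=i)}{\sum_{j=1}^{N} P_E(E=e \mid D=j)\, P_T(T_S=t_S \mid D=j)\, P_{open}(D=j)}
\]
Efficient driving likelihood: \( P_E(E=e \mid D=i) \)
Trip time likelihood: \( P_T(T_S=t_S \mid D=i) \)
Open-world prior:
\[
P_{open}(D=i) = (1-\alpha-\beta)\, P_{closed}(D=i) + \alpha\, W(D=i) + \beta\, P_G(D=i)
\]
Closed-world prior: \( P_{closed}(D=i) \)
Wedding cakes: \( W(D=i) \)
Ground cover: \( P_G(D=i) \)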
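The final probability is a Bayes' rule update: for each candidate cell, multiply the efficient driving likelihood, the trip time likelihood, and the open-world prior, then normalize over all N cells. The sketch below assumes the three quantities are already available as equal-length lists indexed by cell; the function name is hypothetical.

```python
def destination_posterior(efficiency_lik, trip_time_lik, open_prior):
    """Return the normalized posterior over candidate destination cells.

    efficiency_lik[i]: efficient driving likelihood for cell i
    trip_time_lik[i]:  trip time likelihood for cell i
    open_prior[i]:     open-world prior for cell i
    """
    if not (len(efficiency_lik) == len(trip_time_lik) == len(open_prior)):
        raise ValueError("all inputs must cover the same cells")
    unnormalized = [
        e * t * p for e, t, p in zip(efficiency_lik, trip_time_lik, open_prior)
    ]
    total = sum(unnormalized)  # the denominator: sum over all candidate cells
    if total <= 0:
        raise ValueError("no candidate cell has non-zero probability")
    return [u / total for u in unnormalized]


posterior = destination_posterior([0.5, 0.1, 0.4], [1.0, 1.0, 0.5], [0.2, 0.5, 0.3])
best_cell = max(range(len(posterior)), key=posterior.__getitem__)
```

In this toy example the cell with the largest prior is not the most probable destination once the likelihoods are taken into account.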
Half of trips (3667) for training efficiency distributions
Remaining half for testing
Leave-one-out for personal destinations prior
Prediction Error Versus Trip Fraction
[Figure: median prediction error in meters (0 to 6000) versus trip fraction (0 to 1), with three series: complete data model, open-world model, simple closed-world model.]
Destination Modeling
Predestination – Destination prediction
Snap-to-Road – Map matching with temporal constraints
Personalized Routes
Location Privacy
Congestion Pricing
Location Based Services
Pay As You Drive (PAYD) Insurance
Collaborative Traffic Probes (DASH)
Research (London OpenStreetMap)
John Krumm, "Inference Attacks on Location Tracks", Fifth International Conference on Pervasive Computing (Pervasive 2007), May 13-16, 2007, Toronto, Ontario, Canada.
Last Destination – median of last destination before 3 a.m.
Median error = 60.7 meters
Weighted Median – median of all points, weighted by time spent at point (no trip segmentation required)
Median error = 66.6 meters
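A sketch of the weighted median idea, under two assumptions that the slide does not state: the "time spent at a point" is taken to be the gap until the next recorded point, and the median is taken separately for each coordinate. The function names and the (time, x, y) input layout are hypothetical.

```python
def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half the total."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return value


def estimate_home(points):
    """Estimate a home location from a time-ordered track.

    points: list of (t_seconds, x, y) tuples sorted by time.
    Each point is weighted by the time until the next point (an assumption),
    so the final point, which has no successor, carries no weight.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to compute dwell times")
    weights = [points[k + 1][0] - points[k][0] for k in range(len(points) - 1)]
    xs = [p[1] for p in points[:-1]]
    ys = [p[2] for p in points[:-1]]
    return weighted_median(xs, weights), weighted_median(ys, weights)
```

A long gap between recordings, such as a car parked overnight, dominates the weights, which is why no trip segmentation is needed.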
Largest Cluster – cluster points, take median of cluster with most points
Median error = 66.6 meters
Best Time – location at time with maximum probability of being home
Median error = 2390.2 meters (!)
Relative Probability of Home vs. Time of Day
[Figure: relative probability of being at home (0 to 0.02) versus time of day (24 hour clock, 00:00 to 23:00), with 8 a.m. and 6 p.m. marked.]
GPS interval – 6 seconds and 63 meters
GPS satellite acquisition – ≈45 seconds on cold start, time to drive 300 meters at 15 mph
Covered parking – no GPS signal
Distant parking – far from home
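The 300 meter figure for satellite acquisition can be checked with a short unit conversion. The helper name below is illustrative; the only inputs are the 45 seconds and 15 mph quoted above and the standard definition of a mile (1609.344 meters).

```python
METERS_PER_MILE = 1609.344
SECONDS_PER_HOUR = 3600


def distance_during_acquisition(speed_mph: float, acquisition_s: float) -> float:
    """Meters traveled at speed_mph while waiting acquisition_s seconds for a fix."""
    speed_mps = speed_mph * METERS_PER_MILE / SECONDS_PER_HOUR
    return speed_mps * acquisition_s


# 15 mph is about 6.7 m/s, so a 45 second cold start covers roughly 300 meters.
print(round(distance_during_acquisition(15, 45)))
```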
Covered Parking
Distant Parking
Windows Live Search reverse white pages lookup (free API at http://dev.live.com/livesearch/)
MapPoint Web Service reverse geocoding
Windows Live Search reverse white pages
Original
σ = 50 meters noise added
Effect of added noise on address-finding rate
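One way to add such noise is sketched below. It assumes points are in a planar frame measured in meters and draws independent zero-mean Gaussian offsets for each axis; the slide gives only σ, so the per-axis noise model and the function name are assumptions.

```python
import random


def add_gaussian_noise(points, sigma_m=50.0, rng=None):
    """Return a copy of points with Gaussian noise added to each coordinate.

    points:  list of (x, y) positions in meters (assumed planar frame)
    sigma_m: standard deviation of the noise on each axis, in meters
    rng:     optional random.Random instance, for reproducible runs
    """
    rng = rng or random.Random()
    return [
        (x + rng.gauss(0.0, sigma_m), y + rng.gauss(0.0, sigma_m))
        for x, y in points
    ]


# Example with a fixed seed so repeated runs give the same perturbed track.
noisy_track = add_gaussian_noise([(0.0, 0.0), (63.0, 0.0)], 50.0, random.Random(0))
```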
Original
Snap to 50 meter grid
Effect of discretization on address-finding rate
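Snapping can be sketched as rounding each coordinate to the nearest multiple of the grid spacing. The example assumes a planar frame in meters with grid points at multiples of 50 meters from the origin; the origin, the tie-breaking rule, and the function name are assumptions.

```python
import math


def snap_to_grid(x_m: float, y_m: float, cell_m: float = 50.0):
    """Move a point to the nearest grid point of a square grid with spacing cell_m.

    Uses floor(v / cell + 0.5) so that ties always round up, avoiding the
    round-half-to-even behavior of Python's built-in round().
    """
    snap = lambda v: math.floor(v / cell_m + 0.5) * cell_m
    return snap(x_m), snap(y_m)


print(snap_to_grid(123.0, 76.0))
```

After snapping, every point within the same 50 meter square around a grid point becomes indistinguishable from the others.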
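1. Pick a random circle center within "r" meters of home
2. Delete all points in circle with radius "R"
[Figure: a small circle of radius r around the actual home location, containing the random point; a large circle of radius R centered on the random point; data inside the large circle is deleted.]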
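The two steps can be sketched as follows, assuming a planar frame in meters and a random center drawn uniformly over the small disk (the slide says only "random"). The function name `cloak_home` and its return layout are hypothetical.

```python
import math
import random


def cloak_home(points, home, r_m, R_m, rng=None):
    """Delete track points near a randomly offset center around home.

    points: list of (x, y) positions in meters
    home:   (x, y) of the actual home location
    r_m:    radius of the small circle in which the center is drawn
    R_m:    radius of the large circle whose contents are deleted
    Returns (kept_points, center).
    """
    rng = rng or random.Random()
    # Step 1: random center within r_m of home; sqrt makes it uniform over the disk.
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist = r_m * math.sqrt(rng.random())
    center = (home[0] + dist * math.cos(angle), home[1] + dist * math.sin(angle))
    # Step 2: drop every point that falls inside the large circle.
    kept = [
        (x, y) for x, y in points
        if math.hypot(x - center[0], y - center[1]) > R_m
    ]
    return kept, center
```

Offsetting the center means the deleted region is not centered on the home itself, so the middle of the gap in the data does not point directly at the address.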
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.