AWS Webcast - Location-Based Search
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.
Building Location-Based Search with Amazon CloudSearch
Tom Hill, April 3, 2013
Agenda
! What is CloudSearch?
! What is Location-Based Search?
! Computing Distance
! Location-Based Search in CloudSearch
! Sample App
Housekeeping
! Type questions into the GoToMeeting window
  • We'll get to them at the end.
! Recording will be on the AWS YouTube channel
! Slides will be on Slideshare.net
  • A link to the slides will be mailed to all participants
! You can maximize the shared screen's window
What is CloudSearch?
What is CloudSearch?
! Amazon CloudSearch is a fully managed search service
! Easy to add search functionality to your application
! Fast and highly scalable
! Supports search features like
  • Faceting
  • Synonyms, stopwords
  • Ranking
How is CloudSearch Used?
! Create/Configure a "search domain"
! Post documents via HTTP
! Search via HTTP
  • Results as XML or JSON
! Scales automatically
http://search-DOMAIN-XYZ.REGION.cloudsearch.amazonaws.com/2011-02-01/search?q=cat
More Information
! Webinar
  • "Getting Started With Amazon CloudSearch"
! April 17, 9:00am PT / 12:00pm ET
! Register at http://bit.ly/11WZRAZ
What is Location-Based Search?
What is Location-Based Search?
! Using location & distance as factors in search
Why Do You Care?
! People care about things near them.
  • Pizza, Classified Ads, etc.
  • Find a Doctor, Lawyer, …
! Mobile is a key driver
How do we use distance?
! Limit to an Area
  • Box
  • Circle
! Sort by Distance
! Include distance in ranking
  • Combine text relevance and distance
Computing Distance
Computing Distance
! Many Formulas
  • Rectangular distance
  • Equirectangular projection
  • Spherical Law of Cosines
  • Haversine Formula
! Speed vs. Accuracy
  • Speed: Rectangular distance
  • Accuracy: Haversine Formula
Why So Many Ways to Compute Distance?
The earth isn't flat! It isn't a sphere either.
Is the Earth Flat?
! If it's flat
  $\sqrt{(x - x_2)^2 + (y - y_2)^2}$
! If it's not flat
  $2r \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos(\varphi_1)\cos(\varphi_2)\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$
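The two formulas on this slide can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the presenter's code: the function names are made up for the example, and the Earth radius of 6371 km is an assumed value for r (it matches the constant used in the rank expression on the backup slide).

```javascript
// Mean Earth radius in kilometres (assumed value for r in the formula).
const EARTH_RADIUS_KM = 6371;

// Convert degrees to radians; the trigonometric functions expect radians.
function toRadians(degrees) {
  return degrees * Math.PI / 180;
}

// "Flat earth" distance: plain Pythagoras on the raw coordinates.
// The result is in the same units as the inputs, not in kilometres.
function flatDistance(x, y, x2, y2) {
  return Math.sqrt(Math.pow(x - x2, 2) + Math.pow(y - y2, 2));
}

// Haversine great-circle distance between two points given in degrees.
function haversineDistance(lat1, lon1, lat2, lon2, radius = EARTH_RADIUS_KM) {
  const phi1 = toRadians(lat1);
  const phi2 = toRadians(lat2);
  const dPhi = toRadians(lat2 - lat1);
  const dLambda = toRadians(lon2 - lon1);
  const a = Math.pow(Math.sin(dPhi / 2), 2) +
            Math.cos(phi1) * Math.cos(phi2) * Math.pow(Math.sin(dLambda / 2), 2);
  return 2 * radius * Math.asin(Math.sqrt(a));
}
```

Note that the flat version has no notion of units or of longitude lines converging toward the poles, which is where the larger errors on the following slides come from.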
How Much Difference?
! Distances between some cities
  • Fort Lauderdale, FL to New Haven, CT
    • Haversine: 1813
    • Rectangular: 1872
    • Difference: 3%
  • Bangor, ME to Adak, AK
    • Haversine: 7244
    • Rectangular: 12016
    • Difference: 66%
Difference in Detail
! Four computations
  • Rectangular Distance
  • EquiRectangular Projection
  • Spherical Law of Cosines
  • Haversine

Route                                 Haversine    Cosines      EquiRect     Rect           CosErr  EquErr  RecErr
Fort Lauderdale, FL to New Haven, CT  1812.54997   1812.54997   1814.38516   1871.77660     0.000   0.001   0.033
Bangor, ME to Adak, AK                7244.45661   7244.45661   8008.74698   12015.96646    0.000   0.106   0.659
Compared with Haversine                            Same         Close        Not so close
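The two middle methods in this comparison, and the error columns, can be sketched as below. This is an illustrative sketch under assumptions: the function names are invented for the example, the radius of 6371 km is assumed, and the error is taken to be the absolute difference divided by the Haversine value, which is consistent with the numbers on the slide.

```javascript
// Assumed mean Earth radius in kilometres.
const EARTH_RADIUS_KM = 6371;

function toRadians(degrees) {
  return degrees * Math.PI / 180;
}

// Equirectangular projection: shrink the longitude difference by the cosine
// of the mean latitude, then apply Pythagoras.
function equirectangularDistance(lat1, lon1, lat2, lon2) {
  const x = toRadians(lon2 - lon1) * Math.cos(toRadians((lat1 + lat2) / 2));
  const y = toRadians(lat2 - lat1);
  return EARTH_RADIUS_KM * Math.sqrt(x * x + y * y);
}

// Spherical law of cosines.
function cosinesDistance(lat1, lon1, lat2, lon2) {
  const phi1 = toRadians(lat1);
  const phi2 = toRadians(lat2);
  const dLambda = toRadians(lon2 - lon1);
  const c = Math.sin(phi1) * Math.sin(phi2) +
            Math.cos(phi1) * Math.cos(phi2) * Math.cos(dLambda);
  // Clamp so rounding error cannot push the argument outside acos's domain.
  return EARTH_RADIUS_KM * Math.acos(Math.min(1, Math.max(-1, c)));
}

// Relative error of an approximation against a reference distance.
function relativeError(approx, reference) {
  return Math.abs(approx - reference) / reference;
}
```

For example, `relativeError(12015.96646, 7244.45661)` comes out at about 0.659, the RecErr value in the Bangor to Adak row.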
How Much Accuracy?
! "Pizza, 1 Mile"
! Haversine is more accurate
  • If you are a bird
! Any distance is approximate
Which one is the Right Choice?
! As usual: "It depends"
! Factors
  • Desired query speed
  • Index size
  • Accuracy needed
! Start with Equirectangular Projection
  • Test
Location-Based Search in CloudSearch
How to Compute Distance in CloudSearch?
! Rank Expressions
  • Computations run for each matching document
! Can be used for
  • Sorting
  • Influencing Scoring
Two Types of Rank Expressions
! Static Rank Expressions
  • Computation based on values in the index
! Query Time Rank Expressions
  • Allow including parameters at run time
    • e.g. latitude, longitude
Converting to Rank Expression
! Rank expressions have JavaScript-like syntax
! Most math functions
  • log2, log10, sin, cos, atan, min, max, etc.
! So this distance formula:
  $\sqrt{(x - x_2)^2 + (y - y_2)^2}$
! Becomes
  • sqrt(pow((lat-userlat),2)+pow((lon-userlon),2))
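As a rough sketch, the rectangular-distance rank expression on this slide could be assembled from a user's position like this (written with balanced parentheses). The function name and the default field names `lat` and `lon` are assumptions for illustration; use whatever your domain's fields are called. The finite-number check is an added safeguard, not something the slide prescribes.

```javascript
// Build a rectangular-distance rank expression for a given user location.
// latField and lonField name the indexed fields (hypothetical defaults).
function rectangularRankExpression(userLat, userLon, latField = "lat", lonField = "lon") {
  // Only plain numbers should be spliced into the expression text.
  if (!Number.isFinite(userLat) || !Number.isFinite(userLon)) {
    throw new TypeError("userLat and userLon must be finite numbers");
  }
  return "sqrt(pow((" + latField + "-" + userLat + "),2)" +
         "+pow((" + lonField + "-" + userLon + "),2))";
}
```

Calling `rectangularRankExpression(123, 456)` yields `sqrt(pow((lat-123),2)+pow((lon-456),2))`.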
Query Time Rank Expressions
! Define Rank Expression
  • &rank-NAME=EXPRESSION
  • &rank-geo=sqrt(pow((lat-userlat),2)+pow((lon-userlon),2))
! Select Rank Expression
  • &rank=NAME
  • &rank=geo
http://searchendpoint?q=creek&rank=geo&rank-geo=sqrt(pow((la1-123),2)+pow((lo1-456),2))
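When the request is built in code, the expression contains characters such as `+` and `,` that are safer percent-encoded in a query string. A minimal sketch, assuming a hypothetical helper name and the placeholder endpoint from the slide:

```javascript
// Build a search URL that defines a query-time rank expression and selects it.
// endpoint is a placeholder; rankName becomes both rank=NAME and rank-NAME=...
function buildGeoSearchUrl(endpoint, query, rankName, rankExpression) {
  const params = [
    ["q", query],
    ["rank", rankName],
    ["rank-" + rankName, rankExpression],
  ];
  // encodeURIComponent escapes "+" and "," but leaves letters, digits,
  // "-" and parentheses untouched.
  const queryString = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return endpoint + "?" + queryString;
}
```

The `+` in particular matters, because an unencoded `+` in a query string is commonly decoded as a space.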
Influencing Scoring
! Include both text relevance and distance:
  &rank-geo=(1000-text_relevance) + sqrt(pow((la1-ulat),2)+pow((lo1-ulon),2))
! Relevance: Higher is better
! Distance: Lower is better
Combining Relevance and Distance
! Which is more important, distance or relevance?
  • By how much?
! Relative Weight
  • N * text_relevance + M * distance
! That's where the art comes in
  • Will vary by your application
  • Test & tune, and test again
  • Rank expression comparator
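To get a feel for how the weights shift results, the weighted combination can be simulated locally before trying it as a rank expression. This is only a sketch under assumptions: the documents and weights are invented, and relevance is inverted as `1000 - textRelevance` (as in the earlier expression), which assumes relevance tops out at 1000 so that lower combined scores rank first.

```javascript
// Lower score ranks first. Relevance is inverted so that both terms
// follow the same "lower is better" direction as distance.
function combinedScore(textRelevance, distance, relevanceWeight, distanceWeight) {
  return relevanceWeight * (1000 - textRelevance) + distanceWeight * distance;
}

// Return a new array of documents sorted by combined score, best first.
function rankDocuments(docs, relevanceWeight, distanceWeight) {
  return docs
    .map(doc => ({
      ...doc,
      score: combinedScore(doc.textRelevance, doc.distance, relevanceWeight, distanceWeight),
    }))
    .sort((a, b) => a.score - b.score);
}

// Hypothetical sample data: a weak match nearby and a strong match far away.
const sampleDocs = [
  { id: "near-weak", textRelevance: 300, distance: 10 },
  { id: "far-strong", textRelevance: 900, distance: 400 },
];
```

With equal weights the strong but distant match wins (500 versus 710); tripling the distance weight flips the order (730 versus 1300).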
Compare Rank Expressions
Storing Data
! CloudSearch supports only unsigned integers
! Have to convert latitude and longitude to positive ranges
  • latitude + 90
  • longitude + 180
! Have to store as integers, so scale to keep the decimal places
  • latitude = (latitude + 90) * 1000
  • longitude = (longitude + 180) * 1000
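The shift-and-scale conversion described above might look like this in code. The function names are illustrative, and rounding to the nearest integer is an assumption (the slide does not say whether to round or truncate).

```javascript
// Scale factor from the slide: three decimal places of a degree.
const SCALE = 1000;

// Shift latitude from [-90, 90] to [0, 180], then scale to an integer.
function toStoredLatitude(latitude, scale = SCALE) {
  return Math.round((latitude + 90) * scale);
}

// Shift longitude from [-180, 180] to [0, 360], then scale to an integer.
function toStoredLongitude(longitude, scale = SCALE) {
  return Math.round((longitude + 180) * scale);
}

// Inverse conversions, for displaying stored values as degrees again.
function fromStoredLatitude(stored, scale = SCALE) {
  return stored / scale - 90;
}

function fromStoredLongitude(stored, scale = SCALE) {
  return stored / scale - 180;
}
```

The same conversion has to be applied to the user's position at query time, so that both sides of the distance computation are in stored units.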
Performance
! Don't query the whole world
  • Can limit by literals or numeric fields.
  • Literals are more efficient for limits.
! Limit Options
  • Literal
    • &bq=state:'CA'
    • &bq=zip:'94402'
  • Numeric
    • &bq=(and latitude:40..50 longitude:80..85)
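For the numeric option, the range limit can be derived from the user's stored position and a box half-width. A minimal sketch, assuming the field names `latitude` and `longitude` from the example above and a hypothetical helper name; it clamps at zero because the fields are unsigned, and it does not handle boxes that wrap around the 180° meridian.

```javascript
// Build a bq value that limits matches to a square box around a point.
// storedLat, storedLon and halfWidth are all in stored (shifted, scaled) units.
function boundingBoxQuery(storedLat, storedLon, halfWidth) {
  // Unsigned fields cannot go below zero, so clamp the lower bounds.
  const low = value => Math.max(0, value - halfWidth);
  return "(and latitude:" + low(storedLat) + ".." + (storedLat + halfWidth) +
         " longitude:" + low(storedLon) + ".." + (storedLon + halfWidth) + ")";
}
```

For instance, `boundingBoxQuery(45, 82, 5)` produces `(and latitude:40..50 longitude:77..87)`.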
Performance Measures
GeoMethod  TextRel  Limits   Queries  Seconds  QTimeMS  Threads  CompletedQ  AveHits
NONE       false    (none)   10       6.2255   622      1        10          8345450.00
CARTESIAN  false    (none)   10       15.6064  1560     1        10          8345450.00
EQUI       false    (none)   10       19.7106  1971     1        10          8345450.00
COSINES    false    (none)   10       27.4968  2749     1        10          8345450.00
HAVERSINE  false    (none)   10       31.2595  3125     1        10          8345450.00
NONE       false    Numeric  10       9.1758   917      1        10          3807.00
CARTESIAN  false    Numeric  10       9.0255   902      1        10          3807.00
EQUI       false    Numeric  10       9.1158   911      1        10          3807.00
COSINES    false    Numeric  10       9.8321   983      1        10          3807.00
HAVERSINE  false    Numeric  10       9.1272   912      1        10          3807.00
NONE       false    literal  10       0.8254   82       1        10          3781.00
CARTESIAN  false    literal  10       0.5936   59       1        10          3781.00
EQUI       false    literal  10       0.6173   61       1        10          3781.00
COSINES    false    literal  10       0.5916   59       1        10          3781.00
HAVERSINE  false    literal  10       0.6289   62       1        10          3781.00
Why you don't want to query the whole world!
Geo-Spatial Demo Application
Demo Application Structure
HTML Page (JavaScript) --Ajax--> Server (Tomcat) --> CloudSearch
Demo Implementation
! JavaScript
  • Ajax
  • jQuery
  • Google Maps API
! Tomcat Server
  • Java
  • Just for forwarding of requests
    • Because of the browser's same-origin policy, that's why.
Querying CloudSearch

    $.ajax({
        // "searchx" is the Tomcat endpoint that forwards the request to CloudSearch
        url : "searchx",
        data : {
            'q' : currentQuery,
            'domain' : "geoname25",
            'return-fields' : returnFields.join(),
            // select the rank expression named "geo", defined just below
            "rank" : "geo",
            // rectangular distance from the user's stored position (12539, 5784)
            "rank-geo" : "Math.sqrt(Math.pow(Math.abs(doc.latitude_90-12539),2) + " +
                         "Math.pow(Math.abs(doc.longitude_180-5784),2))"
        },
        dataType : "json",
        success : function(data) {
            var hits = data['hits'];
            displaySearchResults(hits['hit'], hits['found']);
            populateMap(hits['hit']);
        }
    });
Getting Latitude & Longitude
Getting Latitude & Longitude
! What if you don't know the latitude & longitude?
! GeoCoding Services
  • Many services
  • Some free tiers
  • May have restrictions
  • Google, Bing, Yahoo, MapQuest, ArcGIS, …
Wrap Up
Recap
! Local search is a component of many applications
! CloudSearch supports local search
  • Using rank expressions
! GeoCoding services
Thanks for Coming!
! Data
  • http://www.geonames.org/export/
! Slides
  • Slideshare.net soon. We'll send you a link.
! Sample Code
  • Talk to me. ([email protected])
! Computations
  • http://www.movable-type.co.uk/scripts/latlong.html
  • Wikipedia
Any Questions?
Thanks for Coming!
Use Zip Code, City, State as a proxy
! Multiple ways to select a location: map, zip code, current location
Radians vs Degrees
! Note: every formula except rectangular distance requires latitude and longitude in radians.
Querying Data
! Java/JavaScript
  • Fields are latitude_90, longitude_180
  • User location is userlat, userlon
! Simple distance
    rank = "Math.sqrt(Math.pow(Math.abs(latitude_90-(" + userlat + ")),2)+Math.pow(Math.abs(longitude_180-(" + userlon + ")),2))";
! Spherical Law of Cosines
    rank = "6371*Math.acos(Math.sin(" + userlat + ") * Math.sin(lat_rad/" + scale + ") + Math.cos(" + userlat + ") * Math.cos(lat_rad/" + scale + ") * Math.cos((lon_rad/" + scale + ") - " + userlon + ") )";
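Since the law-of-cosines expression works in radians, a user position obtained in degrees has to be converted before it is spliced in. The sketch below wraps the expression above in a function that does this; the function name is made up, `lat_rad`, `lon_rad` and `scale` are taken from the slide, and wrapping the user longitude in parentheses is an added precaution so a negative value cannot produce `- -x`.

```javascript
// Convert degrees to radians.
function toRadians(degrees) {
  return degrees * Math.PI / 180;
}

// Build the spherical-law-of-cosines rank expression for a user position
// given in degrees. Stored fields lat_rad and lon_rad hold radians * scale.
function cosinesRankExpression(userLatDegrees, userLonDegrees, scale) {
  const userLat = toRadians(userLatDegrees);
  const userLon = toRadians(userLonDegrees);
  return "6371*Math.acos(Math.sin(" + userLat + ") * Math.sin(lat_rad/" + scale + ")" +
         " + Math.cos(" + userLat + ") * Math.cos(lat_rad/" + scale + ")" +
         " * Math.cos((lon_rad/" + scale + ") - (" + userLon + ")))";
}
```

The constant 6371 is the Earth's mean radius in kilometres, so the expression yields kilometres.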
What CloudSearch Doesn't Do
Issues with assuming the earth is flat.