Geospatial Search With Amazon CloudSearch

DESCRIPTION

Presented by Tom Hill, Amazon CloudSearch Solution Architect, at the LA Amazon CloudSearch User Meetup, this talk covers using location and distance as factors in search. It discusses techniques for measuring proximity as well as performance considerations.

TRANSCRIPT

Page 1: Geospatial Search With Amazon CloudSearch

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.

GeoSpatial Search in Amazon CloudSearch

Tom Hill, January 30, 2013

Page 2: Geospatial Search With Amazon CloudSearch

Agenda

- What is GeoSpatial search?
- Why do we care?
- Computing distance
- Geospatial Search in CloudSearch

Page 3: Geospatial Search With Amazon CloudSearch

What is GeoSpatial Search?

- Using location & distance as factors in search

Page 4: Geospatial Search With Amazon CloudSearch

What's this "Geo", anyway?

- Geographic
  • On the earth
- Spatial
  • Simple distance

Page 5: Geospatial Search With Amazon CloudSearch

How do we use distance?

- Limit to an Area
  • Box
  • Circle
  • Polygon*
- Sort by Distance
- Include distance in score

*Not yet!

Page 6: Geospatial Search With Amazon CloudSearch

Why Do You Care?

- People care about things near them.
  • Pizza, Classified Ads, etc.
  • Find a Doctor, Lawyer, …
- Mobile is a key driver

Page 7: Geospatial Search With Amazon CloudSearch

What can you do with CloudSearch?

Page 8: Geospatial Search With Amazon CloudSearch

Computing Distance

- Many Formulas
  • Rectangular distance
  • Equirectangular projection
  • Spherical Law of Cosines
  • Haversine Formula
  • Vincenty's Formula
- Speed vs. Accuracy
  • Speed: Rectangular distance
  • Accuracy: Vincenty's Formula

Page 9: Geospatial Search With Amazon CloudSearch

Why So Many Ways to Compute Distance?

The earth isn't flat! It isn't a sphere either.

Page 10: Geospatial Search With Amazon CloudSearch

Is the Earth Flat?

- If it's flat
  • distance = sqrt((lat1-lat2)^2 + (lon1-lon2)^2)
- If it's not
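For the non-flat case, one option from the list of formulas above is Haversine. As a sketch, assuming a spherical earth of radius R (about 6371 km, the constant used in the law-of-cosines expression later in this talk), with latitudes φ and longitudes λ in radians:

```latex
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
    + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right),
\qquad
d = 2R \, \arcsin\!\left(\sqrt{a}\right)
```

The symbols φ, λ, R, a and d are introduced here for illustration and do not appear on the slide.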

Page 11: Geospatial Search With Amazon CloudSearch

How Much Accuracy?

- "Pizza, 1 Mile"
- Haversine is more accurate
  • If you are a bird
- Any distance is approximate

Page 12: Geospatial Search With Amazon CloudSearch

Comparing Distance Computations

- Four computations: Haversine, Cosines, EquiRect, Rect.
- Different computations – Different Results
- How accurate do you need to be?

Distances are in km; the error columns are relative to Haversine.

 Haversine     Cosines    EquiRect         Rect   CosErr  EquErr  RecErr  Route
 994.79893   994.79893   995.25921   1044.40926    0.000   0.000   0.050  Fort Lauderdale, FL to Anniston, AL
3624.04339  3624.04339  3642.98321   4163.41737    0.000   0.005   0.149  Fort Lauderdale, FL to San Diego, CA
1812.54997  1812.54997  1814.38516   1871.77660    0.000   0.001   0.033  Fort Lauderdale, FL to New Haven, CT
8175.93897  8175.93897  8817.96563  11107.90729    0.000   0.079   0.359  Fort Lauderdale, FL to Adak, AK
7244.45661  7244.45661  8008.74698  12015.96646    0.000   0.106   0.659  Bangor, ME to Adak, AK
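A comparison like this one can be sketched in JavaScript. This is a minimal sketch, not the code behind the table: it assumes a spherical earth with a mean radius of 6371 km, inputs in decimal degrees, and a "rectangular" distance that simply treats degrees of latitude and longitude as equal. All function names are illustrative.

```javascript
// Sketch: four ways to estimate the distance in km between two points
// given in decimal degrees. R is an assumed mean earth radius.
const R = 6371;
const rad = (deg) => (deg * Math.PI) / 180;

function haversine(lat1, lon1, lat2, lon2) {
  const dLat = rad(lat2 - lat1);
  const dLon = rad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

function cosines(lat1, lon1, lat2, lon2) {
  const c =
    Math.sin(rad(lat1)) * Math.sin(rad(lat2)) +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.cos(rad(lon2 - lon1));
  // Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
  return R * Math.acos(Math.min(1, Math.max(-1, c)));
}

function equirectangular(lat1, lon1, lat2, lon2) {
  // Shrink the longitude difference by the cosine of the mean latitude.
  const x = rad(lon2 - lon1) * Math.cos(rad((lat1 + lat2) / 2));
  const y = rad(lat2 - lat1);
  return R * Math.sqrt(x * x + y * y);
}

function rectangular(lat1, lon1, lat2, lon2) {
  // Treats a degree of longitude like a degree of latitude everywhere.
  return R * Math.sqrt(rad(lat2 - lat1) ** 2 + rad(lon2 - lon1) ** 2);
}

// Error of an estimate relative to a reference distance.
const relativeError = (estimate, reference) =>
  Math.abs(estimate - reference) / reference;
```

For example, `relativeError(rectangular(26.12, -80.14, 32.72, -117.16), haversine(26.12, -80.14, 32.72, -117.16))` compares the two methods for rough Fort Lauderdale and San Diego coordinates. Those coordinates are approximations chosen here, so the result will not match the table digit for digit.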

Page 13: Geospatial Search With Amazon CloudSearch

GeoSpatial Search in Amazon CloudSearch

Page 14: Geospatial Search With Amazon CloudSearch

How to Compute Distance in CloudSearch?

- Rank Expressions
  • Computations run for each matching document
- Can be used for
  • Sorting
  • Influencing Scoring

Page 15: Geospatial Search With Amazon CloudSearch

Two Types of Rank Expressions

- Static Rank Expressions
  • Computation based on values in index
- Query Time Rank Expressions
  • Allow including parameters at run time.
      • e.g. latitude, longitude.

Page 16: Geospatial Search With Amazon CloudSearch

Query Time Rank Expressions

- Define Rank Expression
  • &rank-NAME=EXPRESSION
  • &rank-geo=sqrt(pow((lat-userlat),2)+pow((lon-userlon),2))
- Select Rank Expression
  • &rank=NAME
  • &rank=geo

http://searchendpoint?q=creek&rank=geo&rank-geo=sqrt(pow((la1-123),2)+pow((lo1-456),2))
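Assembling such a URL by hand is error-prone, because the `+` inside the expression has to be percent-encoded or it will be read as a space. The following is a minimal sketch of one way to build it. The function names are illustrative, `la1` and `lo1` are the field names from the example URL, the endpoint is the slide's placeholder, and the user values are assumed to be in the same non-negative form as the indexed fields.

```javascript
// Sketch: build a search URL with a query-time rank expression.
function buildGeoRankExpression(latField, lonField, userLat, userLon) {
  return `sqrt(pow((${latField}-${userLat}),2)+pow((${lonField}-${userLon}),2))`;
}

function buildSearchUrl(endpoint, query, userLat, userLon) {
  // URLSearchParams percent-encodes "(", ",", and "+" in the expression.
  const params = new URLSearchParams({
    q: query,
    rank: "geo", // select the expression named "geo"
    "rank-geo": buildGeoRankExpression("la1", "lo1", userLat, userLon),
  });
  return `${endpoint}?${params.toString()}`;
}
```

Calling `buildSearchUrl("http://searchendpoint", "creek", 123, 456)` yields the same parameters as the example URL, in encoded form.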

Page 17: Geospatial Search With Amazon CloudSearch

Influencing Scoring

- Include both text relevance and distance:

&rank-geo=(1000-text_relevance) + sqrt(pow((la1-ulat),2)+pow((lo1-ulon),2))

- Relative Weight
  • N * text_relevance + M * distance
  • That's where the art comes in.
  • Will vary by your application.
  • Test & tune, and test again.
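The effect of the weights can be explored offline before tuning against a live domain. The sketch below is illustrative only: it follows the slide's `(1000 - text_relevance)` form so that lower scores are better for both terms, and the documents, field names and weights are made up.

```javascript
// Sketch: combine text relevance and distance into one score (lower is better).
// textRelevance is assumed to range from 0 to 1000, as the slide's
// (1000 - text_relevance) term implies.
function combinedScore(doc, n, m) {
  return n * (1000 - doc.textRelevance) + m * doc.distance;
}

function rankDocs(docs, n, m) {
  // Copy before sorting so the caller's array is left untouched.
  return [...docs].sort(
    (a, b) => combinedScore(a, n, m) - combinedScore(b, n, m)
  );
}

// Hypothetical documents: one very relevant but far away,
// one less relevant but close by.
const docs = [
  { id: "far-but-relevant", textRelevance: 900, distance: 50 },
  { id: "near-but-weaker", textRelevance: 700, distance: 2 },
];
```

With `n = 1, m = 1` the more relevant document ranks first. Raising the distance weight to `m = 10` puts the nearer one on top, which is the trade-off the slide asks you to test and tune.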

Page 18: Geospatial Search With Amazon CloudSearch

Storing Data

- CloudSearch supports unsigned integers
- Have to convert latitude, longitude to positive ranges
  • latitude + 90
  • longitude + 180
- Have to store as integers; need to scale
  • latitude = (latitude + 90) * 100
  • longitude = (longitude + 180) * 100
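The conversion above can be sketched as a pair of encode and decode helpers. The function names and the rounding choice are assumptions. The scale of 100 comes from the slide and keeps two decimal places, so stored positions are quantized to 0.01 degree.

```javascript
// Sketch: convert between decimal degrees and scaled unsigned integers.
const SCALE = 100; // two decimal places, as on the slide

function encodeLatitude(lat) {
  return Math.round((lat + 90) * SCALE); // -90..90 becomes 0..18000
}

function encodeLongitude(lon) {
  return Math.round((lon + 180) * SCALE); // -180..180 becomes 0..36000
}

function decodeLatitude(value) {
  return value / SCALE - 90;
}

function decodeLongitude(value) {
  return value / SCALE - 180;
}
```

For instance, `encodeLatitude(35.39)` gives 12539 and `encodeLongitude(-122.16)` gives 5784, the two constants that appear in the demo's rank expression later in the deck.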

Page 19: Geospatial Search With Amazon CloudSearch

Performance

- Don't query the whole world
  • Can limit by literals or numeric fields.
  • Literals are more efficient for limits.
- Limit Options
  • Literal
      • &bq=state:'CA'
      • &bq=zip:'94402'
  • Numeric
      • &bq=(and latitude:40..50 longitude:80..85)
  • Geohash
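A numeric limit like the one above can be generated from the user's position. This is a minimal sketch under stated assumptions: the fields are named `latitude` and `longitude` as in the example, they hold values shifted and scaled by 100 as described on the Storing Data slide, and `delta` is a half-width in those same scaled units. It does not handle boxes that wrap around the 180° meridian.

```javascript
// Sketch: build a bounding-box bq clause around a scaled user position.
function boundingBoxQuery(latScaled, lonScaled, delta) {
  const clamp = (value, lo, hi) => Math.min(hi, Math.max(lo, value));
  // Valid ranges assume (degrees + offset) * 100.
  const latMin = clamp(latScaled - delta, 0, 18000);
  const latMax = clamp(latScaled + delta, 0, 18000);
  const lonMin = clamp(lonScaled - delta, 0, 36000);
  const lonMax = clamp(lonScaled + delta, 0, 36000);
  return `(and latitude:${latMin}..${latMax} longitude:${lonMin}..${lonMax})`;
}
```

`boundingBoxQuery(12539, 5784, 50)` returns `(and latitude:12489..12589 longitude:5734..5834)`, a box of half a degree in each direction around the point.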

Page 20: Geospatial Search With Amazon CloudSearch

Performance Measures

GeoMethod  TextRel  Limits   Queries  Seconds  QTimeMS  Threads  CompletedQ  AveHits
NONE       false    -        10        6.2255      622  1        10          8345450.00
CARTESIAN  false    -        10       15.6064     1560  1        10          8345450.00
EQUI       false    -        10       19.7106     1971  1        10          8345450.00
COSINES    false    -        10       27.4968     2749  1        10          8345450.00
HAVERSINE  false    -        10       31.2595     3125  1        10          8345450.00
NONE       false    Numeric  10        9.1758      917  1        10             3807.00
CARTESIAN  false    Numeric  10        9.0255      902  1        10             3807.00
EQUI       false    Numeric  10        9.1158      911  1        10             3807.00
COSINES    false    Numeric  10        9.8321      983  1        10             3807.00
HAVERSINE  false    Numeric  10        9.1272      912  1        10             3807.00
NONE       false    literal  10        0.8254       82  1        10             3781.00
CARTESIAN  false    literal  10        0.5936       59  1        10             3781.00
EQUI       false    literal  10        0.6173       61  1        10             3781.00
COSINES    false    literal  10        0.5916       59  1        10             3781.00
HAVERSINE  false    literal  10        0.6289       62  1        10             3781.00

Why you don't want to query the whole world!

Page 21: Geospatial Search With Amazon CloudSearch

Geo-Spatial Demo Application

Page 22: Geospatial Search With Amazon CloudSearch

Page 23: Geospatial Search With Amazon CloudSearch

Demo Application Structure

HTML Page (JavaScript) -> Ajax -> Server (Tomcat) -> CloudSearch

Page 24: Geospatial Search With Amazon CloudSearch

Demo Implementation

- JavaScript
  • Ajax
  • jQuery
  • Google Maps API
- Twitter Bootstrap
  • CSS
- Tomcat Server
  • Java
  • Just for forwarding of requests
      • Because XSS, that's why.

Page 25: Geospatial Search With Amazon CloudSearch

Querying CloudSearch

$.ajax({
  url : "searchx",
  data : {
    'q' : currentQuery,
    'domain' : "geoname25",
    'return-fields' : returnFields.join(),
    "rank" : "geo",
    "rank-geo" : "Math.sqrt(Math.pow(Math.abs(doc.latitude_90 - 12539),2) + Math.pow(Math.abs(doc.longitude_180 - 5784),2))"
  },
  dataType : "json",
  success : function(data) {
    var hits = data['hits'];
    displaySearchResults(hits['hit'], hits['found']);
    populateMap(hits['hit']);
  }
});

Page 26: Geospatial Search With Amazon CloudSearch

Wrap Up

Page 27: Geospatial Search With Amazon CloudSearch

Thanks for Coming!

- Data
  • http://www.geonames.org/export/
- Slides
  • On the meetup group soon
- Sample Code
  • Talk to me. ([email protected])
- Computations
  • http://www.movable-type.co.uk/scripts/latlong.html
  • Wikipedia

Page 28: Geospatial Search With Amazon CloudSearch

Querying Data

- Java/JavaScript
  • Fields are latitude_90, longitude_180
  • User location is userlat, userlon
- Simple distance

rank = "Math.sqrt(Math.pow(Math.abs(latitude_90-(" + userlat + ")),2)+Math.pow(Math.abs(longitude_180-(" + userlon + ")),2))";

- Spherical Law of Cosines

rank = "6371*Math.acos(Math.sin(" + userlat + ") * Math.sin(lat_rad/" + scale + ") + Math.cos(" + userlat + ") * Math.cos(lat_rad/" + scale + ") * Math.cos((lon_rad/" + scale + ") - " + userlon + ") )";
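Because these expressions are built by string concatenation, it helps to wrap them in small functions and check them locally before sending them anywhere. The sketch below does that. The helper names are invented here, and it assumes that `lat_rad` and `lon_rad` hold radians multiplied by `scale` and that the user position passed to the law-of-cosines version is already in radians. `evaluateRank` is a local test aid only and is not part of CloudSearch.

```javascript
// Sketch: wrap the two rank strings in functions and evaluate them locally.
function simpleDistanceRank(userlat, userlon) {
  return (
    "Math.sqrt(Math.pow(Math.abs(latitude_90-(" + userlat + ")),2)" +
    "+Math.pow(Math.abs(longitude_180-(" + userlon + ")),2))"
  );
}

function cosinesRank(userlat, userlon, scale) {
  return (
    "6371*Math.acos(Math.sin(" + userlat + ") * Math.sin(lat_rad/" + scale + ")" +
    " + Math.cos(" + userlat + ") * Math.cos(lat_rad/" + scale + ")" +
    " * Math.cos((lon_rad/" + scale + ") - " + userlon + "))"
  );
}

// Local check only: evaluate an expression against one document's fields.
// The field names become the parameters of a generated function.
function evaluateRank(expression, fields) {
  const names = Object.keys(fields);
  const fn = new Function(...names, "return " + expression + ";");
  return fn(...names.map((name) => fields[name]));
}
```

As a quick check, `evaluateRank(simpleDistanceRank(12539, 5784), { latitude_90: 12542, longitude_180: 5788 })` evaluates to 5, the hypotenuse of a 3-4-5 triangle in scaled units.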

Page 29: Geospatial Search With Amazon CloudSearch

What CloudSearch Doesn't Do

Page 30: Geospatial Search With Amazon CloudSearch

Issues with assuming the earth is flat.