Amazon CloudSearch Meetup, August 15, 2012

Page 1: Amazon  CloudSearch Meetup August 15, 2012

© 2012 Amazon.com, Inc. and its affiliates.  All rights reserved.  May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc.

Amazon CloudSearch Meetup

August 15, 2012

Page 2: Amazon  CloudSearch Meetup August 15, 2012

Welcome

Housekeeping

Slides will be posted

Drawing

Page 3: Amazon  CloudSearch Meetup August 15, 2012

Agenda

Introduction to CloudSearch
• Jon Handler, CloudSearch Solutions Architect

Relevance and Ranking
• Jack Conradson, Software Engineer

Case Study: Reddit
• Keith Mitchell, Programmer

Q&A

Page 4: Amazon  CloudSearch Meetup August 15, 2012

Introduction to CloudSearch

Page 5: Amazon  CloudSearch Meetup August 15, 2012

Introduction to Search

Page 6: Amazon  CloudSearch Meetup August 15, 2012

Inverted Index

US President
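
As a minimal sketch of the inverted index idea named on this slide, the Python below maps each term to the set of documents that contain it, then answers a query such as "US President" by intersecting those sets. The sample documents, the whitespace tokenizer, and the function names are illustrative assumptions, not material from the talk.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, for illustration only.
docs = {
    1: "the US president signed the bill",
    2: "the president of the club resigned",
    3: "US exports rose last year",
}
index = build_inverted_index(docs)
print(search(index, "US President"))  # {1}
```

Looking up a term is a dictionary access rather than a scan of every document, which is why search engines build this structure at indexing time.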

Page 7: Amazon  CloudSearch Meetup August 15, 2012

Search On The Web

• Relevance/Ranking
• Faceting
• Range Searching
• Fielded Searching
• Boolean Queries
• Complex Relevance

Page 14: Amazon  CloudSearch Meetup August 15, 2012

Amazon CloudSearch

Page 15: Amazon  CloudSearch Meetup August 15, 2012

Amazon CloudSearch

• Fully-managed, full-featured search service
• Automatically scales for data & traffic
• Handles both structured and unstructured data
• Near real-time indexing
• Up and running in less than 1 hour

Page 16: Amazon  CloudSearch Meetup August 15, 2012

Amazon CloudSearch Architecture

[Diagram: three service endpoints, each behind access control]

• Search endpoint (search service): search documents. The search client (www.example.com) sends search requests and receives search results. The search developer uses the Search API or the Search Tester in the console.
• Document service endpoint (document service): add, update, and delete documents. The search developer sends documents through the Document Service API, the command line tools, or the console.
• Configuration service endpoint (configuration service): create, configure, and delete domains. The search developer creates and manages domains through the Configuration API, the command line tools, or the console.

Page 17: Amazon  CloudSearch Meetup August 15, 2012

Automatic Scaling: Data & Traffic

[Diagram: a grid of search instances, each holding one copy of one index partition (Index Partition 1 to n, Copy 1 to n)]

• DATA (document quantity and size): more index partitions
• TRAFFIC (search request volume and complexity): more copies of each partition
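
The partition-and-copy layout can be illustrated with a toy model. This is only a sketch of the general scatter-gather pattern for partitioned indexes, not CloudSearch's actual implementation: the hash-based placement, the round-robin choice of copy, and every name below are assumptions.

```python
import itertools
import zlib


class ToyDomain:
    """Toy model: documents are split across partitions; each partition has copies."""

    def __init__(self, partitions, copies):
        self.partitions = partitions
        self.copies = copies
        # One document store per partition; every copy would serve the same data.
        self.stores = [dict() for _ in range(partitions)]
        self._next_copy = itertools.cycle(range(copies))

    def instance_count(self):
        # More data -> more partitions; more traffic -> more copies.
        return self.partitions * self.copies

    def add(self, doc_id, text):
        # Assumed placement rule: stable hash of the id modulo the partition count.
        part = zlib.crc32(doc_id.encode("utf-8")) % self.partitions
        self.stores[part][doc_id] = text

    def search(self, term):
        # Scatter the query to one copy of every partition, then merge the hits.
        copy = next(self._next_copy)
        hits = []
        for store in self.stores:
            hits.extend(d for d, text in store.items() if term in text.lower().split())
        return copy, sorted(hits)


domain = ToyDomain(partitions=2, copies=3)
domain.add("song1", "Cajun Twisters")
domain.add("song2", "Twisters of the plains")
print(domain.instance_count())       # 6
print(domain.search("twisters")[1])  # ['song1', 'song2']
```

Every query touches all partitions but only one copy of each, so adding copies spreads request load while adding partitions spreads the data.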

Page 18: Amazon  CloudSearch Meetup August 15, 2012

Example: Build Your Playlist

Page 19: Amazon  CloudSearch Meetup August 15, 2012

Use Case

Million Song Dataset: http://labrosa.ee.columbia.edu/millionsong/

Search documents are songs
• Attributes: title, artist names, years, genre, artist familiarity

We’ll use this to create a “Build Your Playlist” web application.

Page 20: Amazon  CloudSearch Meetup August 15, 2012

Page 21: Amazon  CloudSearch Meetup August 15, 2012

Demo

Page 22: Amazon  CloudSearch Meetup August 15, 2012

SDF Documents

[
  {
    "type": "add",
    "id": "sombzze12a8c134960",
    "version": 5,
    "lang": "en",
    "fields": {
      "title": "Cajun Twisters",
      "artist_name": "Adam Ant",
      "year": "1993",
      "song_id": "sombzze12a8c134960",
      "artist_familiarity": 449425,
      "genre": ["alternative", "electronic", "instrumental", "rock"]
    }
  },
  …
]
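
A batch shaped like the one on this slide could be generated from application records with a few lines of Python. This is a sketch: the `to_sdf_add` helper and the input record are hypothetical, and the field names simply mirror the slide.

```python
import json


def to_sdf_add(song, version):
    """Wrap one song record in an SDF "add" operation shaped like the slide's example."""
    return {
        "type": "add",
        "id": song["song_id"],
        "version": version,
        "lang": "en",
        "fields": dict(song),
    }


# Hypothetical input record mirroring the slide's fields.
songs = [{
    "title": "Cajun Twisters",
    "artist_name": "Adam Ant",
    "year": "1993",
    "song_id": "sombzze12a8c134960",
    "artist_familiarity": 449425,
    "genre": ["alternative", "electronic", "instrumental", "rock"],
}]

batch = [to_sdf_add(song, version=5) for song in songs]
print(json.dumps(batch, indent=2))
```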

Page 23: Amazon  CloudSearch Meetup August 15, 2012

Configuration

cs-configure-from-sdf
• Analyzes the source files to infer fields and types (heuristic)

Configure fields individually
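
To give a feel for what a heuristic pass over the source files might look like, here is a rough Python sketch that guesses a field type from the values seen in an SDF batch. It is not the logic of cs-configure-from-sdf: the rules, the `infer_field_types` name, and the two type labels (`uint`, `text`) are assumptions made for illustration.

```python
def infer_field_types(batch):
    """Guess a type per field: 'uint' for non-negative integers, else 'text'."""
    guesses = {}
    for op in batch:
        for name, value in op.get("fields", {}).items():
            values = value if isinstance(value, list) else [value]
            is_uint = all(
                (isinstance(v, int) and not isinstance(v, bool) and v >= 0)
                or (isinstance(v, str) and v.isdigit())
                for v in values
            )
            guess = "uint" if is_uint else "text"
            # A field stays 'uint' only if every document seen so far agrees.
            if guesses.get(name, guess) != guess:
                guess = "text"
            guesses[name] = guess
    return guesses


# Hypothetical one-document batch using the slide's field names.
sample = [{
    "type": "add",
    "id": "a",
    "fields": {
        "title": "Cajun Twisters",
        "year": "1993",
        "artist_familiarity": 449425,
        "genre": ["rock"],
    },
}]
print(infer_field_types(sample))
```

A guess like this is only a starting point, which is presumably why the slide also lists configuring fields individually.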

Page 24: Amazon  CloudSearch Meetup August 15, 2012

Page 25: Amazon  CloudSearch Meetup August 15, 2012

Upload Documents

Page 26: Amazon  CloudSearch Meetup August 15, 2012

PHP Integration

// Build the search request URL and fetch the JSON response.
// $keyword, $bqParam, and $rank are set elsewhere in the application;
// $keyword is assumed to be URL-encoded already.
$results = file_get_contents(
    "http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy."
    . "us-east-1.cloudsearch.amazonaws.com"
    . "/2011-02-01/search?q=" . $keyword . $bqParam
    . "&return-fields=title,artist_name,year,genre_result,artist_familiarity&"
    . "facet=year_facet,genre&"
    . "facet-year_facet-sort=alpha&"
    . "facet-genre-sort=alpha&"
    . "facet-genre-top-n=100000&"
    . "facet-year_facet-top-n=100000&"
    . "t-year=1985..&"
    . "t-title=a..&"
    . "rank=-" . $rank);
// Decode the JSON response into a PHP object.
$resultsObj = json_decode($results);
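
For comparison, the same request URL can be assembled in Python with the standard library, which also takes care of URL encoding. This is a sketch under assumptions: the endpoint and parameter names are copied from the PHP on this slide, `build_search_url` is a hypothetical helper, and passing the Boolean query as a `bq` parameter is a guess at what `$bqParam` contains.

```python
from urllib.parse import urlencode

# Endpoint host and API version copied from the slide's PHP example.
SEARCH_ENDPOINT = (
    "http://search-mn-songs-5bbplyghbb5tk257rsb7iamlsy."
    "us-east-1.cloudsearch.amazonaws.com/2011-02-01/search"
)


def build_search_url(keyword, rank, bq=None):
    """Build the search URL with the same parameters as the PHP snippet."""
    params = [
        ("q", keyword),
        ("return-fields", "title,artist_name,year,genre_result,artist_familiarity"),
        ("facet", "year_facet,genre"),
        ("facet-year_facet-sort", "alpha"),
        ("facet-genre-sort", "alpha"),
        ("facet-genre-top-n", "100000"),
        ("facet-year_facet-top-n", "100000"),
        ("t-year", "1985.."),
        ("t-title", "a.."),
        ("rank", "-" + rank),  # leading "-" as in the slide's rank parameter
    ]
    if bq is not None:
        # Assumption: $bqParam in the PHP carries a "bq" Boolean query.
        params.insert(1, ("bq", bq))
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = build_search_url("adam ant", rank="artist_familiarity")
print(url)
```

Fetching and decoding the response would then be a call to `urllib.request.urlopen(url)` followed by `json.loads` on the body, mirroring `file_get_contents` and `json_decode`.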

Page 27: Amazon  CloudSearch Meetup August 15, 2012

Common Feature Requests

• Field Weighted Relevance
• Additional Regions and Languages
• High Availability
• Tighter integration with other AWS services (Dynamo/S3)
• Support For Very Large Use Cases
• Geo Sorting

Page 28: Amazon  CloudSearch Meetup August 15, 2012

Field-Weighted Values

Page 29: Amazon  CloudSearch Meetup August 15, 2012

Field Weights Use Case

Music Search
• Dataset composed of the following fields:
  • Title
  • Album
  • Artist
  • Lyrics
  • Popularity

Results without field weights
• May end up with results based heavily on lyrics when searching for an artist’s name (Guns N' Roses vs. roses, guns)

Results with field weights
• Possibly apply a greater weight to artist than lyrics
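
The effect of field weights can be seen with a toy scorer. The sketch below is an assumption-laden illustration, not CloudSearch's relevance algorithm: it counts query-term matches per field and multiplies each count by that field's weight. The documents, the weights, and `weighted_score` are all made up for the example.

```python
def weighted_score(query, doc, weights, default_weight=1.0):
    """Sum, over fields, of (matching query terms in the field) x (field weight)."""
    terms = query.lower().split()
    score = 0.0
    for field, text in doc.items():
        tokens = str(text).lower().split()
        matches = sum(tokens.count(term) for term in terms)
        score += weights.get(field, default_weight) * matches
    return score


# Hypothetical documents: one by the artist, one that only uses the words in its lyrics.
by_artist = {"title": "Paradise City", "artist": "guns n roses", "lyrics": "take me down"}
lyrics_only = {"title": "Garden Song", "artist": "someone else", "lyrics": "roses and guns and roses"}

query = "guns roses"
flat = {}  # no field weights: every field counts the same
boosted = {"artist": 3.0, "lyrics": 0.5}

print(weighted_score(query, by_artist, flat), weighted_score(query, lyrics_only, flat))        # 2.0 3.0
print(weighted_score(query, by_artist, boosted), weighted_score(query, lyrics_only, boosted))  # 6.0 1.5
```

Without weights the lyrics-only song outranks the song by the artist; weighting artist above lyrics reverses the order.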

Page 30: Amazon  CloudSearch Meetup August 15, 2012

FWV in Rank Expressions

Rank expressions can be used within CloudSearch to customize relevance computations and return better search results.
• song_relevance = text_relevance + popularity

It is natural to extend rank expressions to allow field-weighted values using JSON objects.
• song_relevance = cs.text_relevance({weights: {artist=3.0, song=4.0}, default_weight=0.5}) + 0.5*popularity
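
The first expression on this slide, text_relevance + popularity, can be mimicked client-side to show how a rank expression reorders hits. This is only a sketch: in the service, text_relevance is computed by the search engine, and the scores below are invented for illustration.

```python
def song_relevance(doc):
    """Evaluate the slide's first expression: text_relevance + popularity."""
    return doc["text_relevance"] + doc["popularity"]


# Hypothetical hits with made-up scores.
hits = [
    {"id": "a", "text_relevance": 120, "popularity": 10},
    {"id": "b", "text_relevance": 90, "popularity": 75},
    {"id": "c", "text_relevance": 100, "popularity": 5},
]

# Highest value first, as a descending rank would order results.
ranked = sorted(hits, key=song_relevance, reverse=True)
print([h["id"] for h in ranked])  # ['b', 'a', 'c']
```

Hit "b" has the weakest text match but moves to the top once popularity is added in.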

Page 31: Amazon  CloudSearch Meetup August 15, 2012

Query-Time Rank Expressions

Each set of defined rank expressions may take a while to be deployed to your search domain.

Query-time rank expressions would allow rank expressions to be defined during a query, without having to wait.
• q='guns roses'&rank-qtre=cs.text_relevance({weights: {artist=3.0, song=4.0}, default_weight=0.5})&return-fields=qtre&rank=-qtre
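
A query string like the one on this slide contains braces, spaces, and equals signs, so it needs URL encoding before it is sent. The sketch below builds it with Python's standard library. Query-time rank expressions are presented in this talk as a proposed feature, so the parameter names and expression syntax are taken from the slide as-is and may not match any shipped API; `build_qtre_query` is a hypothetical helper.

```python
from urllib.parse import urlencode, parse_qs


def build_qtre_query(q, name, expression):
    """Build a query string that defines a rank expression and ranks by it."""
    params = [
        ("q", q),
        ("rank-" + name, expression),  # define the expression at query time
        ("return-fields", name),       # return its value with each hit
        ("rank", "-" + name),          # rank by it, "-" prefix as on the slide
    ]
    return urlencode(params)


# Expression text copied from the slide (proposed syntax).
expr = "cs.text_relevance({weights: {artist=3.0, song=4.0}, default_weight=0.5})"
qs = build_qtre_query("'guns roses'", "qtre", expr)
print(qs)

# Decoding the string recovers the original expression unchanged.
print(parse_qs(qs)["rank-qtre"][0] == expr)  # True
```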

Page 32: Amazon  CloudSearch Meetup August 15, 2012

Resources

Amazon CloudSearch Overview Page: http://aws.amazon.com/cloudsearch/
• FAQs
• Community Forum
• Documentation & Getting Started Tutorial (IMDb)

Demos and Tutorials
• What Is Amazon CloudSearch
• Introducing Amazon CloudSearch (Features)
• Building a Search Application Using Amazon CloudSearch
• Getting Started Tutorial

Page 33: Amazon  CloudSearch Meetup August 15, 2012

Upcoming Events

• Enterprise Search Summit/KMWorld, DC, Oct. 17-19
• Bay Area Amazon CloudSearch Group: Oct. 24
• Las Vegas, November 27-29

Page 34: Amazon  CloudSearch Meetup August 15, 2012

Q&A

Page 35: Amazon  CloudSearch Meetup August 15, 2012

Thank You