Getting the Most Out of Type-Ahead/Autocomplete - LavaCon 2015 Proposal by Brian Eisenberg
Post on 15-Jul-2015
1Copyright © 2013 Earley & Associates, Inc. All Rights Reserved.
Essential site search features and
functionality and how they can be used
to deliver an improved search experience
Advances in Search & Findability
Outline
• Introduction
• What’s this webinar about and why should I care?
Leveraging taxonomies for search
Search issues on LOC.GOV
• Overview of essential search features and functionality
Setting up search
Search analytics
Faceted search
Auto complete
Redirects
Auto Correct
Sort
Compare
• Q&A
Getting the most out of type-ahead/autocomplete
In this lecture, attendees will learn about the latest and greatest in
type-ahead and autocomplete technology and functionality.
Brian Eisenberg, Associate Search & Taxonomy Consultant – Earley & Associates
5 years experience leading search & taxonomy
Experience with Endeca, ATG, Omniture
Introduction
Why is this important?
• Identify some search issues we see on special library sites
• Show the features and tools we use on popular search engine platforms to
address relevancy
• Talk about how search benefits from well-designed taxonomies
• Hopefully you’ll get some ideas on ways these features can be leveraged in your
organizations
• Integrating a taxonomy for search can improve the results and experience in several ways:
Auto-completion using taxonomy entities
Refinement of results using the full taxonomy (faceted search/browse)
Synonym expansion of content based on taxonomy
Ability to expand results or begin navigation of the taxonomy
Leveraging Taxonomies for Search
Pre-search filtering in auto complete based on taxonomy
Post-search filters (faceted search)
• Conducting a misspelled search query on loc.gov
User is prevented from seeing the thousands of great results available at LOC
because there is not a simple spellcheck feature in place.
Opportunity is lost to teach the searcher the correct spelling.
Search on LOC.GOV
Search on LOC.GOV
• Selecting the top result from LOC SERP shows that even that was not relevant,
a ‘false positive’, which would have been eliminated by using some of the core
relevancy ranking features we’ll be discussing.
• The query was automatically corrected based on a probability algorithm, and the desired
results were delivered
• Similar tools available on leading open source and commercial search engines
Autocorrect on Google
A brief overview of setting up search and measuring quality
Setting up Site Search
Most search engines come out of the box with relevancy scoring based on a popular model, like
TF/IDF which extends a simple Boolean model.
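To make the TF/IDF idea concrete, here is a minimal sketch of one common textbook variant. It is not the exact formula any particular engine ships with; the function name, the whitespace tokenizer, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf_scores(query, docs):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for t in tokenized if term in t)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)    # share of the document taken by the term
            idf = math.log(n_docs / df) + 1.0  # rarer terms weigh more; +1 avoids a zero weight
            score += tf * idf
        scores.append(score)
    return scores

# Hypothetical documents, not taken from any real index.
docs = [
    "snare drum with maple shell",
    "guitar cable ten feet",
    "drum sticks and drum heads",
]
scores = tf_idf_scores("drum", docs)
```

The third document mentions "drum" twice in five words, so it scores higher than the first; the second does not mention it and scores zero.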
Search is sometimes set up by IT alone, which often leads to poor results in terms of relevancy
and UX.
Relevant search results come from understanding the business and user needs and the content
available, and from customizing the search experience to support those goals.
Test, review, iterate.
A few search ranking factors to highlight:
Date of publication
Boost documents that have been published more recently
Number of matching terms (Min Match)
Can define the number and/or percent of terms from a multi-term query that must match a document for it to be
considered relevant (e.g. if the query has 1 or 2 terms, all must match; if 3 terms, then 2 of 3 must match; if 4 or
more, then 75% or more must match)
Field weight boosts
Can give preference to matches in the title or header of a page, which are more indicative of the
‘aboutness’ of the document, over matches in the body
Document Type boosts
Can give preference to certain types of content (e.g. buying guides over products over photos)
Term proximity
Determine how far apart terms may appear for the document to be considered relevant (e.g. Franklin
Roosevelt)
Developing the search algorithm
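The min-match rule from the list above can be sketched in a few lines. This is an illustrative reading of the slide's thresholds, not the configuration syntax of any search engine; the function names, the whitespace tokenizer, and the choice to round 75% up are assumptions.

```python
import math

def required_matches(num_query_terms):
    """How many query terms must appear in a document, per the slide's example rule."""
    if num_query_terms <= 2:
        return num_query_terms  # 1 or 2 terms: all must match
    if num_query_terms == 3:
        return 2                # 3 terms: 2 of 3 must match
    return math.ceil(0.75 * num_query_terms)  # 4 or more: at least 75% (rounded up)

def passes_min_match(query, document_text):
    """Check whether a document satisfies the min-match rule for a query."""
    terms = set(query.lower().split())
    doc_terms = set(document_text.lower().split())
    matched = sum(1 for term in terms if term in doc_terms)
    return matched >= required_matches(len(terms))
```

With this sketch, the two-term query "franklin roosevelt" accepts a document containing both words and rejects one that contains only "franklin".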
• Once documents are indexed we can begin to customize the search algorithm by
defining the fields that are searched and the relative importance of the fields via boosts.
• When a search is run the algorithm scores documents and results are returned based on
score.
<field name="merchantName" type="text" indexed="true" stored="true" omitNorms="false" boost="20.0"/>
<field name="displayName" type="text" indexed="true" stored="true" omitNorms="false" boost="10.0"/>
<field name="merchantMetaKeywords" type="text" indexed="true" stored="true" omitNorms="false" boost="0.5"/>
<field name="protectedKeywords" type="text" indexed="true" stored="true"/>
<field name="keywordPrefixes" type="text" indexed="true" stored="true"/>
<field name="merchantAdditionalKeywords" type="text" indexed="true" stored="true"/>
<field name="merchantSearchKeywords" type="text" indexed="true" stored="true" omitNorms="false" boost="5.0"/>
Solr algorithm example
Field searched Field boost
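To show how field boosts like the ones above influence ranking, here is a simplified sketch. It is not Solr's actual scoring formula: it only multiplies raw term hits by the boost of the field they occur in. The field names and boost values come from the example; the functions and sample documents are hypothetical.

```python
# Boost values copied from the field definitions above.
FIELD_BOOSTS = {
    "merchantName": 20.0,
    "displayName": 10.0,
    "merchantSearchKeywords": 5.0,
    "merchantMetaKeywords": 0.5,
}

def score_document(query, document):
    """Sum boost * (number of query-term hits) over the boosted fields."""
    terms = query.lower().split()
    score = 0.0
    for field, boost in FIELD_BOOSTS.items():
        field_tokens = document.get(field, "").lower().split()
        hits = sum(field_tokens.count(term) for term in terms)
        score += boost * hits
    return score

def rank(query, documents):
    """Return documents ordered from highest to lowest score."""
    return sorted(documents, key=lambda d: score_document(query, d), reverse=True)

# Hypothetical merchant records for illustration only.
documents = [
    {"merchantName": "Burger Barn", "merchantMetaKeywords": "pizza burgers fries"},
    {"merchantName": "Pizza Palace", "merchantMetaKeywords": "italian food"},
    {"displayName": "Pizza deals near you", "merchantSearchKeywords": "pizza"},
]
ranked = rank("pizza", documents)
```

A match in merchantName (boost 20.0) outranks matches in displayName plus merchantSearchKeywords (10.0 + 5.0), which in turn outrank a match in merchantMetaKeywords alone (0.5).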
Easy to use interfaces to review and edit relevancy factors and control search features.
• Solr Relevancy Workbench
• Endeca Workbench
Search manager UIs for relevancy tuning
Search analytics
• There are great analytics tools available, and any of them can be used to gain insight.
• Google Analytics is free, easy to install, and provides robust, actionable data:
• How often are users searching and what are they searching for?
• What searches are leading to 0 results?
• What are users doing following their search? Exiting or clicking through?
Search analytics – null results
• Below is a set of queries which led to zero or null results pulled from search analytics for
an online coupon site.
• Searches were then categorized by cause: a synonym needed for thesaurus expansion,
a content gap, or other.
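As a rough sketch of the first step in that review, the snippet below pulls zero-result queries out of a search log and counts them, so the most frequent ones can be categorized first. The log format (a list of records with `query` and `result_count` keys) is an assumption, not the export format of any analytics tool.

```python
from collections import Counter

def zero_result_queries(search_log):
    """Return (query, count) pairs for searches that returned no results, most frequent first."""
    counts = Counter(
        entry["query"].strip().lower()
        for entry in search_log
        if entry["result_count"] == 0
    )
    return counts.most_common()

# Hypothetical log entries for illustration only.
search_log = [
    {"query": "Cupon", "result_count": 0},
    {"query": "cupon", "result_count": 0},
    {"query": "pizza", "result_count": 42},
    {"query": "skydiving", "result_count": 0},
]
null_report = zero_result_queries(search_log)
```

In this example "cupon" would likely be a misspelling or synonym candidate, while "skydiving" could point to a content gap; that judgment is still a manual step.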
Features and functionality we use most often to improve the
relevancy of search results delivered.
Key Search Features
• http://www.musiciansfriend.com/search?sB=r&Ntt=tambourine
• A site search usually returns thousands of results, and one of the best ways a
user can sort through them is faceted search, aka refinement types.
• These filters are usually present in the left rail on search results pages. Notice the
“subcategories” and “narrow by” options in the left rail; they are all ways to refine
the search results:
Faceted search - a.k.a Search Refinements:
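A minimal sketch of what sits behind a refinement rail: count how many results carry each facet value, then filter when the user picks one. The field names, product records, and function names are hypothetical and not taken from the site shown.

```python
from collections import Counter

def facet_counts(results, facet_field):
    """Count how many results carry each value of the given facet field."""
    return Counter(item[facet_field] for item in results if facet_field in item)

def apply_refinement(results, facet_field, value):
    """Keep only the results whose facet field equals the selected value."""
    return [item for item in results if item.get(facet_field) == value]

# Hypothetical search results for the query "tambourine".
results = [
    {"name": "Tambourine A", "brand": "BrandX", "price_range": "Under $25"},
    {"name": "Tambourine B", "brand": "BrandX", "price_range": "$25 - $50"},
    {"name": "Tambourine C", "brand": "BrandY", "price_range": "Under $25"},
]
brand_counts = facet_counts(results, "brand")
under_25 = apply_refinement(results, "price_range", "Under $25")
```

The counts are what a site would display next to each refinement option; applying a refinement narrows the result list without running a new keyword search.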
• Faceted search can also be applied to static pages, such as this category page. A
deeper level of detail can be applied to refinement types that are specific to the
category. Notice at the bottom of the left rail, shell material and snare size:
• Global refinements:
• Local refinements:
Static page refinements: www.musiciansfriend.com/snare-drums
Auto complete - LOC
• Autocomplete available on the redesigned LOC site but a number of the suggestions are confusing and it doesn’t
appear to have been optimized before rollout.
• Goal of auto complete is not only to help users avoid misspellings and get results more quickly but guide them to
better queries.
• There are different approaches to type-ahead/auto complete. Results sorted by
matching brands, products, and top searches (taken from internal search logs):
Auto complete - Guitar Center
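One simple approach to the "top searches" part of a type-ahead list is a prefix match against query counts from internal search logs, ordered by popularity. The sketch below assumes a hypothetical dictionary of query counts; it does not reproduce how any of the sites shown build their suggestions.

```python
def suggest(prefix, top_searches, limit=5):
    """Return up to `limit` logged queries starting with the prefix, most popular first."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [
        (query, count)
        for query, count in top_searches.items()
        if query.lower().startswith(prefix)
    ]
    # Most frequent first; alphabetical order breaks ties.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [query for query, _ in matches[:limit]]

# Hypothetical query counts taken from an internal search log.
top_searches = {
    "guitar": 900,
    "guitar amp": 650,
    "guitar strings": 400,
    "gong": 30,
    "drum": 500,
}
suggestions = suggest("gu", top_searches)
```

Brands, products, and categories could be suggested the same way from their own lists and shown as separate groups.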
Type-ahead list that includes top searches, results for each of the top searches with
thumbnails, matching categories, brands, buying guides, and installations:
Auto complete – Home Depot
Auto complete - LinkedIn
Results clustered by type with images and logos.
People in your network are listed first.
• Redirect to any landing page: creates a controlled, custom experience and can be
applied to any keyword. For example, “guitar” on Musicians Friend does not run a
search but rather redirects to the category page (doubles the conversion rate):
Auto redirects to landing pages: www.musiciansfriend.com/guitars
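Conceptually, a keyword redirect is a lookup that runs before the search itself. The sketch below assumes a hand-maintained mapping; the `/guitars` path mirrors the landing page named above, while the table, the extra "guitars" entry, and the function name are hypothetical.

```python
# Hypothetical redirect table: normalized keyword -> landing page path.
REDIRECTS = {
    "guitar": "/guitars",
    "guitars": "/guitars",
}

def resolve_query(query):
    """Return ("redirect", path) for a redirect keyword, else ("search", normalized query)."""
    key = " ".join(query.lower().split())  # lowercase and collapse whitespace
    if key in REDIRECTS:
        return ("redirect", REDIRECTS[key])
    return ("search", key)
```

Only exact keyword matches redirect in this sketch, so a longer query such as "guitar strap" still runs as a normal search.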
• http://www.musiciansfriend.com/search?sB=r&Ntt=chmes
Auto correct misspellings and approximate matches:
“Chmes” becomes “chimes” and provides the same search results automatically.
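A minimal sketch of approximate matching, using string similarity from Python's standard `difflib` module rather than the probability-based approach a large search engine would use. The vocabulary list and the cutoff value are assumptions; in practice the vocabulary would come from the index or the search logs.

```python
import difflib

def autocorrect(query, vocabulary, cutoff=0.8):
    """Return the closest known term for a query, or the query unchanged if none is close."""
    term = query.strip().lower()
    if term in vocabulary:
        return term
    # get_close_matches returns the best candidates at or above the similarity cutoff.
    candidates = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return candidates[0] if candidates else term

# Hypothetical vocabulary of indexed terms.
vocabulary = ["chimes", "cymbals", "tambourine", "cable"]
corrected = autocorrect("chmes", vocabulary)
```

A query with no close neighbor is passed through unchanged, which leaves room for a "no results" message or a thesaurus lookup.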
• http://www.musiciansfriend.com/search?sB=r&Ntt=cord
Thesaurus entries, e.g. “cord” equals “cable”:
Querying “cord” or “cable” provides the same search results.
These entries are entered manually, whereas auto-correction is automatic.
Endeca workbench thesaurus entry: one way
Notice here that “oysters”, “lobster”, and “shellfish” are entered as a subset of “seafood”.
Only a search for “seafood” is expanded to all of the terms.
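The difference between two-way and one-way thesaurus entries can be sketched as follows. This reads the one-way entry as: a search for "seafood" also searches the narrower terms, but not the reverse. The data structures and function name are hypothetical and do not reflect how Endeca stores its thesaurus.

```python
# Two-way entries: every term in a group expands to the whole group.
TWO_WAY = [{"cord", "cable"}]

# One-way entries: only the key expands to the listed terms.
ONE_WAY = {"seafood": {"oysters", "lobster", "shellfish"}}

def expand_query(term):
    """Return the set of terms to search for, after thesaurus expansion."""
    term = term.strip().lower()
    expanded = {term}
    for group in TWO_WAY:
        if term in group:
            expanded |= group
    expanded |= ONE_WAY.get(term, set())
    return expanded
```

Here `expand_query("cord")` and `expand_query("cable")` return the same set, while `expand_query("lobster")` returns only "lobster".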
• User can sort search results based on any specified metadata so they have control
in seeing search results ordered by their desired criteria
Search results page sorts, AKA “SERP sorts”
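A sort on a results page amounts to reordering the result set by one metadata field. The sketch below assumes results are dictionaries and places records missing the field at the end; the field names and records are hypothetical.

```python
def sort_results(results, field, descending=False):
    """Order results by a metadata field; records without the field go last."""
    present = [r for r in results if r.get(field) is not None]
    missing = [r for r in results if r.get(field) is None]
    return sorted(present, key=lambda r: r[field], reverse=descending) + missing

# Hypothetical result records.
results = [
    {"name": "Item A", "price": 49.99, "rating": 4.5},
    {"name": "Item B", "price": 19.99, "rating": 3.8},
    {"name": "Item C", "rating": 4.9},
]
by_price = sort_results(results, "price")
by_rating = sort_results(results, "rating", descending=True)
```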
• Start with search for “lawn” on Home Depot and choose to “compare” two items:
Compare functionality
• Show search results side by side to compare specs. These attributes are any
defined metadata fields that can be global or unique to the category:
Compare functionality
Scroll down for a deeper comparison:
Compare item records side by side. Any specified metadata can be displayed here.
• Products, articles, media, reviews all in one search. Search and web
platforms are able to create their own indices and display results from all
sources in all formats.
• Notice buying guides and product guides mixed in with products,
categories, and brands:
Aggregate content
• Choose your store to see custom results:
Personalization and contextualization
• Searches another site for what you just searched:
New development: Search ad