Query Understanding at LinkedIn [Talk at Facebook]

Query Understanding and Search Assistance @ LinkedIn Abhi Lad (Engineering Lead, Search Quality)

Upload: abhimanyu-lad

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Query Understanding at LinkedIn [Talk at Facebook]

Query Understanding and Search Assistance @ LinkedIn

Abhi Lad (Engineering Lead, Search Quality)

Page 2: Query Understanding at LinkedIn [Talk at Facebook]

Outline

● Search at LinkedIn

● Goal of search

● Search assistance / Guided search

● Query understanding & rewriting

Page 3: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Page 4: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Universal search box

Page 5: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Navigational People search

Page 6: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Exploratory People search

FACETS

Page 7: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Exploratory People search

Page 8: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Job Search

Page 9: Query Understanding at LinkedIn [Talk at Facebook]

Search at LinkedIn

Federated Search

JOBS

PEOPLE

PEOPLE

Page 10: Query Understanding at LinkedIn [Talk at Facebook]

Goal of Search

Page 11: Query Understanding at LinkedIn [Talk at Facebook]

Goal of search

Help users find who or what they are looking for with minimal effort

Page 12: Query Understanding at LinkedIn [Talk at Facebook]

Goal of search

Help users find who or what they are looking for with minimal effort

1. Help users frame “good” queries

2. Understand the user’s underlying intent / information need

3. Rewrite the query to ensure a good result set

4. Rank the results based on the user and the query

5. Provide good result attribution: snippets, highlighting

6. Propose next actions to refine results

Page 13: Query Understanding at LinkedIn [Talk at Facebook]

Search Assistance

Page 14: Query Understanding at LinkedIn [Talk at Facebook]

Search Assistance

● Query Assistance: [Pre-retrieval] Help users frame their queries easily
  ○ Autocomplete, Search suggestions in typeahead, Spellcheck, ...

● Guided Search: [Post-retrieval] Guide users through their search process
  ○ Facet suggestions, Related searches, ...

(Especially useful for exploratory queries)

Page 15: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Query autocomplete

Search suggestions

Page 16: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Query autocomplete => Entity detection => Search suggestions

Page 17: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Query autocomplete => Entity detection => Search suggestions

Autocomplete system:

● Based on query logs

● Index and retrieve using Lucene FST

● Can complete the last part of the query (even if the entire query was previously unseen)

(Do not index people names)

Page 18: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Autocomplete

Use query logs to index unigrams (tokens), bigrams, and entities (companies, titles, skills, locations)

● Compute co-occurrence statistics
● Build FST for efficient “prefix => entity” retrieval

Query: [senior digital product manager sa|n francisco]

Score based on entity co-occurrence using the last entity in the query (product manager):

● P(san francisco | product manager)
● P(san diego | product manager)
● P(sandisk | product manager)

Fall back to bigram co-occurrence:

● P(francisco | san) x P(san | manager)
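The scoring on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the counts are made-up toy numbers standing in for query-log statistics, a plain list scan stands in for the FST prefix lookup, and the function names (`entity_score`, `bigram_score`, `complete`) are hypothetical, not LinkedIn's.

```python
from collections import Counter

# Toy counts standing in for statistics mined from query logs (invented numbers).
ENTITY_PAIR_COUNTS = Counter({
    ("product manager", "san francisco"): 120,
    ("product manager", "san diego"): 40,
    ("product manager", "sandisk"): 5,
})
ENTITY_COUNTS = Counter({"product manager": 400})
BIGRAM_COUNTS = Counter({("manager", "san"): 30, ("san", "francisco"): 80})
UNIGRAM_COUNTS = Counter({"manager": 300, "san": 100})


def entity_score(candidate, last_entity):
    """Estimate P(candidate | last_entity) from entity co-occurrence counts."""
    total = ENTITY_COUNTS[last_entity]
    if total == 0:
        return 0.0
    return ENTITY_PAIR_COUNTS[(last_entity, candidate)] / total


def bigram_score(candidate, prev_token):
    """Fallback: chain bigram probabilities, e.g. P(san | manager) x P(francisco | san)."""
    tokens = [prev_token] + candidate.split()
    score = 1.0
    for left, right in zip(tokens, tokens[1:]):
        if UNIGRAM_COUNTS[left] == 0:
            return 0.0
        score *= BIGRAM_COUNTS[(left, right)] / UNIGRAM_COUNTS[left]
    return score


def score(candidate, last_entity, prev_token):
    """Use the entity-level estimate when available, else the bigram fallback."""
    s = entity_score(candidate, last_entity)
    return s if s > 0 else bigram_score(candidate, prev_token)


def complete(prefix, last_entity, prev_token, candidates):
    """Rank candidates starting with the typed prefix (list scan instead of an FST)."""
    matches = [c for c in candidates if c.startswith(prefix)]
    return sorted(matches, key=lambda c: score(c, last_entity, prev_token), reverse=True)


ranked = complete("sa", "product manager", "manager",
                  ["san francisco", "san diego", "sandisk", "seattle"])
```

Mixing entity-level and bigram-level scores in one ranking is a simplification; a real system would presumably calibrate or back off more carefully.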

Page 19: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Autocomplete

● Personalization
  ○ [ma]
    ■ machinist
    ■ manager
    ■ machine learning?

● Implicit spelling correction
  ○ [macine lear] => machine learning

● Use similar entities to complete previously unseen queries
  ○ [software engineer] ⇔ [software developer]
  ○ Complete [hadoop software de|veloper] based on [hadoop software engineer]

Page 20: Query Understanding at LinkedIn [Talk at Facebook]

Autocomplete & Search Suggestions

Search Suggestions

● Personalization

○ [hadoop]

■ “People with hadoop skills”

■ “Jobs requiring hadoop skills”

● Suggestions with multiple entities

○ [hadoop engineer san francisco]

■ “Hadoop engineer jobs in San Francisco”

Page 21: Query Understanding at LinkedIn [Talk at Facebook]

Spellcheck

● Fix obvious typos

● Help users spell names

Page 22: Query Understanding at LinkedIn [Talk at Facebook]

Spellcheck

People names

Companies

Titles

Past queries

Page 23: Query Understanding at LinkedIn [Talk at Facebook]

Spellcheck

PROBLEM: User profiles as well as query logs contain many spelling errors

(Frequency alone is not helpful due to the long-tail distribution of entities)

Page 24: Query Understanding at LinkedIn [Talk at Facebook]

Spellcheck

PROBLEM: User profiles as well as query logs contain many spelling errors

SOLUTION: Use query chains and click data to infer correct spelling
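One way to read "query chains and click data" is: when a query with no click is immediately followed by a near-identical query that does get a click, treat the second as a likely correction of the first. The sketch below assumes that interpretation; the session format, the `max_distance` threshold, and the function names are illustrative assumptions, not details from the talk.

```python
from collections import Counter, defaultdict


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def mine_corrections(sessions, max_distance=2):
    """Map each unclicked query to the clicked near-duplicate that most often follows it."""
    votes = defaultdict(Counter)
    for session in sessions:
        for (q1, clicked1), (q2, clicked2) in zip(session, session[1:]):
            if not clicked1 and clicked2 and 0 < edit_distance(q1, q2) <= max_distance:
                votes[q1][q2] += 1
    return {q: counts.most_common(1)[0][0] for q, counts in votes.items()}


# Each session is a list of (query, had_click) pairs in time order (toy data).
sessions = [
    [("macine learning", False), ("machine learning", True)],
    [("macine learning", False), ("machine learning", True)],
    [("lawyer", False), ("attorney", True)],  # a reformulation, not a typo
]
corrections = mine_corrections(sessions)
```

The edit-distance cap keeps synonym-style reformulations such as [lawyer] => [attorney] out of the spelling dictionary.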

Page 25: Query Understanding at LinkedIn [Talk at Facebook]

Spellcheck

● Better error model
  ○ Improved metaphone (version 3)
  ○ Platform aware: Keyboard edit distance on mobile

● Machine-learned model

● Support for partial queries
  ○ Spellcheck-as-you-type for “Instant” search
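A keyboard-aware edit distance can be approximated by making substitutions between neighbouring keys cheaper than substitutions between distant keys. The sketch below assumes a simplified QWERTY grid (row stagger ignored) and an arbitrary 0.5 cost for neighbouring keys; neither value comes from the talk.

```python
# Simplified QWERTY layout: keys are neighbours if they are at most one row
# and one column apart (an approximation that ignores row stagger).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POS = {ch: (r, c) for r, row in enumerate(QWERTY_ROWS) for c, ch in enumerate(row)}


def substitution_cost(a, b):
    """0 for identical keys, 0.5 for neighbouring keys (assumed weight), 1 otherwise."""
    if a == b:
        return 0.0
    if a in KEY_POS and b in KEY_POS:
        (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
        if abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1:
            return 0.5
    return 1.0


def keyboard_edit_distance(s, t):
    """Levenshtein distance with keyboard-weighted substitutions."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, cs in enumerate(s, 1):
        cur = [float(i)]
        for j, ct in enumerate(t, 1):
            cur.append(min(
                prev[j] + 1.0,                           # deletion
                cur[j - 1] + 1.0,                        # insertion
                prev[j - 1] + substitution_cost(cs, ct)  # substitution or match
            ))
        prev = cur
    return prev[-1]


near = keyboard_edit_distance("hadoop", "hadoip")  # 'o' and 'i' are adjacent keys
far = keyboard_edit_distance("hadoop", "hadoqp")   # 'o' and 'q' are far apart
```

With these weights a fat-finger typo ranks as a closer match than an arbitrary substitution, which is the behaviour one would want on a touch keyboard.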

Page 26: Query Understanding at LinkedIn [Talk at Facebook]

Facet Suggestions

Page 27: Query Understanding at LinkedIn [Talk at Facebook]

Facet Suggestions

Page 28: Query Understanding at LinkedIn [Talk at Facebook]

Facet Suggestions

● Query awareness
  ○ For TITLE queries, suggest seniority facet
  ○ Don’t suggest facets for name queries
  ○ Don’t suggest redundant/conflicting facets (location facet when query has location)

● User awareness
  ○ User profile: Users often restrict search results to their own location, industry, seniority
  ○ User behavior: Recruiters often restrict to particular industry, location

● Document set awareness
  ○ Ensure minimum number of results
  ○ Bias towards higher-quality results (people, jobs, …)
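The three kinds of awareness above can be pictured as a chain of filters over candidate facets. The following is a rule-based sketch only: the tag names, the `FACET_FOR_TAG` mapping, the profile fields, and the `min_results` threshold are all assumptions made for illustration, and a production system would more likely learn these decisions than hard-code them.

```python
# Hypothetical mapping from a query tag to the facet it would make redundant.
FACET_FOR_TAG = {"LOCATION": "location", "COMPANY": "company"}


def suggest_facets(query_tags, user_profile, facet_counts, min_results=10):
    """Return (facet, value) suggestions for a tagged query.

    query_tags   -- set or list of entity tags found in the query
    user_profile -- dict of the searcher's own attributes (user awareness)
    facet_counts -- dict mapping (facet, value) to result count (document set awareness)
    """
    # Query awareness: no facets for name queries.
    if "NAME" in query_tags:
        return []
    facets = ["location", "industry"]
    # Query awareness: title queries get a seniority suggestion first.
    if "TITLE" in query_tags:
        facets.insert(0, "seniority")
    # Query awareness: skip facets the query already constrains.
    already_in_query = {FACET_FOR_TAG[t] for t in query_tags if t in FACET_FOR_TAG}

    suggestions = []
    for facet in facets:
        value = user_profile.get(facet)
        if value is None or facet in already_in_query:
            continue
        # Document set awareness: require a minimum number of results.
        if facet_counts.get((facet, value), 0) < min_results:
            continue
        suggestions.append((facet, value))
    return suggestions


profile = {"seniority": "senior", "location": "san francisco", "industry": "internet"}
counts = {("seniority", "senior"): 50, ("location", "san francisco"): 200,
          ("industry", "internet"): 3}
title_only = suggest_facets(["TITLE"], profile, counts)
title_and_location = suggest_facets(["TITLE", "LOCATION"], profile, counts)
name_query = suggest_facets(["NAME"], profile, counts)
```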

Page 29: Query Understanding at LinkedIn [Talk at Facebook]

Query Understanding and Rewriting

Page 30: Query Understanding at LinkedIn [Talk at Facebook]

Query Understanding

Page 31: Query Understanding at LinkedIn [Talk at Facebook]

Query Tagging

(Recognized entities: Names, titles, companies, schools, locations, skills)

Page 32: Query Understanding at LinkedIn [Talk at Facebook]

Query Tagger

Sequential model trained on the following data:

● Emission probabilities (dictionary)
  ○ Profiles – Names, Titles, Schools, Locations
  ○ Standardized data – Companies, Skills

● Transition probabilities
  ○ Query logs
  ○ Tags for query tokens inferred based on result clicks

Page 33: Query Understanding at LinkedIn [Talk at Facebook]

Query Tagger

Prediction:

1. Segmentation: Maximum likelihood using unigram/bigram counts
   [data scientist] [linkedin] [mountain view]

2. Sequence labeling: Viterbi decoding
   [TITLE] [COMPANY] [LOCATION]

3. Entity linking: Dictionary
   [TITLE ID=435] [COMPANY ID=1337] [LOCATION ID=us:ca:mountain_view]
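Steps 2 and 3 can be sketched as Viterbi decoding over already-segmented query pieces, followed by a dictionary lookup. The probabilities below are invented toy values, the tag set is reduced to three tags, and segmentation (step 1) is assumed to have been done already; only the entity IDs are taken from the slide.

```python
import math

TAGS = ["TITLE", "COMPANY", "LOCATION"]
# Toy probabilities (assumptions); per the talk, emissions would come from
# profile and standardized-data dictionaries, transitions from query logs.
START = {"TITLE": 0.5, "COMPANY": 0.3, "LOCATION": 0.2}
TRANSITION = {
    "TITLE": {"TITLE": 0.1, "COMPANY": 0.6, "LOCATION": 0.3},
    "COMPANY": {"TITLE": 0.2, "COMPANY": 0.1, "LOCATION": 0.7},
    "LOCATION": {"TITLE": 0.4, "COMPANY": 0.4, "LOCATION": 0.2},
}
EMISSION = {
    "TITLE": {"data scientist": 0.9},
    "COMPANY": {"linkedin": 0.9},
    "LOCATION": {"mountain view": 0.8},
}
SMOOTHING = 1e-6  # probability assigned to segments missing from a dictionary

# Entity IDs as shown on the slide.
ENTITY_IDS = {
    ("TITLE", "data scientist"): "435",
    ("COMPANY", "linkedin"): "1337",
    ("LOCATION", "mountain view"): "us:ca:mountain_view",
}


def viterbi(segments):
    """Return the most likely tag sequence for the given query segments."""
    if not segments:
        return []

    def emit(tag, seg):
        return math.log(EMISSION[tag].get(seg, SMOOTHING))

    # best[tag] = (log-probability, path) of the best sequence ending in tag.
    best = {t: (math.log(START[t]) + emit(t, segments[0]), [t]) for t in TAGS}
    for seg in segments[1:]:
        new_best = {}
        for t in TAGS:
            score, path = max(
                (best[p][0] + math.log(TRANSITION[p][t]), best[p][1]) for p in TAGS
            )
            new_best[t] = (score + emit(t, seg), path + [t])
        best = new_best
    return max(best.values())[1]


def link(segments, tags):
    """Attach a dictionary ID to each tagged segment (None if unknown)."""
    return [(tag, ENTITY_IDS.get((tag, seg))) for seg, tag in zip(segments, tags)]


segments = ["data scientist", "linkedin", "mountain view"]
tags = viterbi(segments)
linked = link(segments, tags)
```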

Page 34: Query Understanding at LinkedIn [Talk at Facebook]

Query Tagging

Using query tags:

● Query tags used for ranking model selection
  ○ Name query => NAME MODEL
  ○ Title query, Skill query => TITLE MODEL
  ○ ...

● More precise matching with documents

[software engineer google new york]

is rewritten to

[TITLE:(software engineer) COMPANY:(google) GEO:(new york)]
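Both uses of query tags on this slide can be illustrated with a small sketch. The tagged segments are taken as given (the tagger's output), and the `DEFAULT MODEL` fallback is a hypothetical placeholder for the cases the slide leaves unspecified.

```python
def select_ranking_model(tags):
    """Pick a ranking model name from the set of tags found in the query."""
    if "NAME" in tags:
        return "NAME MODEL"
    if "TITLE" in tags or "SKILL" in tags:
        return "TITLE MODEL"
    return "DEFAULT MODEL"  # hypothetical fallback, not named in the talk


def rewrite(tagged_segments):
    """Turn (tag, text) pairs into a fielded query string."""
    fields = " ".join(f"{tag}:({text})" for tag, text in tagged_segments)
    return f"[{fields}]"


tagged = [("TITLE", "software engineer"), ("COMPANY", "google"), ("GEO", "new york")]
rewritten = rewrite(tagged)
model = select_ranking_model({tag for tag, _ in tagged})
```

Restricting each segment to its own field is what allows the retrieval layer to match "google" against the company field only, rather than anywhere in a profile.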

Page 35: Query Understanding at LinkedIn [Talk at Facebook]

Entity-based filtering

BEFORE

AFTER

escape hatch

Page 36: Query Understanding at LinkedIn [Talk at Facebook]

Query Expansion

Name synonyms Job Title synonyms

Page 37: Query Understanding at LinkedIn [Talk at Facebook]

Query Expansion

● Titles
  ○ Query reformulations
    ■ [programmer] => [software engineer] => CLICK
    ■ [lawyer] => [attorney] => CLICK
    ■ [attorney] => [legal counsel] => CLICK

● Names
  ○ Query reformulations
  ○ Dictionaries
    ■ bob == robert
    ■ beth == elizabeth
    ■ ...
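The "reformulation followed by a click" pattern can be mined with a simple counter. The sketch below assumes the log has already been reduced to (first query, second query, clicked) triples and uses an arbitrary `min_support` threshold to filter noise; both are illustrative choices.

```python
from collections import Counter


def mine_synonyms(chains, min_support=2):
    """Collect query => reformulation pairs that led to a click often enough."""
    counts = Counter()
    for first_query, second_query, clicked in chains:
        if clicked and first_query != second_query:
            counts[(first_query, second_query)] += 1
    synonyms = {}
    for (q1, q2), n in counts.items():
        if n >= min_support:
            synonyms.setdefault(q1, set()).add(q2)
    return synonyms


def expand(query, synonyms):
    """Return the original query followed by its mined expansions."""
    return [query] + sorted(synonyms.get(query, ()))


# Toy reformulation log: (first query, reformulated query, click on results).
chains = [
    ("programmer", "software engineer", True),
    ("programmer", "software engineer", True),
    ("lawyer", "attorney", True),
    ("lawyer", "attorney", True),
    ("lawyer", "doctor", False),
]
synonyms = mine_synonyms(chains)
expanded = expand("lawyer", synonyms)
```

Note that the mined relation is directional: [lawyer] expands to [attorney] here, but not the other way round unless the log also supports it.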

Page 38: Query Understanding at LinkedIn [Talk at Facebook]

Name spelling variants

Name Clustering

Page 39: Query Understanding at LinkedIn [Talk at Facebook]

Name spelling variants

Two-step clustering:
1. Coarse clustering – metaphone
2. Finer clustering – edit distance, hand-written rules…

Each name is assigned to a cluster
NC_SRIRAM = {sriram, sreeram, sriraam, shriram, …}

NC_SRIRAM

Name Clustering
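The two-step clustering can be sketched as bucketing by a phonetic key and then splitting each bucket by edit distance. Metaphone is not in the Python standard library, so the sketch swaps in a crude consonant-skeleton key (`phonetic_key`) as a stand-in; that key, the `max_distance` threshold, and the greedy assignment to the first member of each cluster are all assumptions. "saramma" is a made-up string chosen only to show the finer step splitting a bucket.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Plain Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def phonetic_key(name):
    """Crude stand-in for metaphone: first letter plus de-duplicated consonants.

    Assumes a non-empty name.
    """
    name = name.lower().replace("sh", "s")
    key = name[0]
    for ch in name[1:]:
        if ch in "aeiou":
            continue
        if ch != key[-1]:
            key += ch
    return key


def cluster_names(names, max_distance=2):
    """Step 1: bucket by phonetic key. Step 2: split buckets by edit distance."""
    buckets = defaultdict(list)
    for name in names:
        buckets[phonetic_key(name)].append(name)

    clusters = []
    for bucket in buckets.values():
        bucket_clusters = []
        for name in bucket:
            for cluster in bucket_clusters:
                # Compare against the cluster's first member (its representative).
                if edit_distance(name, cluster[0]) <= max_distance:
                    cluster.append(name)
                    break
            else:
                bucket_clusters.append([name])
        clusters.extend(bucket_clusters)
    return clusters


clusters = cluster_names(["sriram", "sreeram", "sriraam", "shriram", "saramma", "suresh"])
```

At query time, a name token could then be replaced by its cluster ID (such as NC_SRIRAM) so that any variant retrieves profiles containing any other variant.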

Page 40: Query Understanding at LinkedIn [Talk at Facebook]

Summary

● Search assistance and guided search are critical for ensuring search success
  ○ Good query => good results

● High degree of structure in queries and documents (profiles, jobs, …)
  ○ Query understanding and Document understanding are crucial
  ○ “Things not Strings” => entity-based retrieval

● Query understanding and rewriting play an important role in result set quality
  ○ A good initial set of documents simplifies the ranker’s job
  ○ Good result set => accurate facet counts
  ○ Allows for sorting options other than relevance (recency, number of connections, …)

Page 41: Query Understanding at LinkedIn [Talk at Facebook]

Thank You!