How Google Works: A Ranking Engineer's Perspective by Paul Haahr
![Page 1: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/1.jpg)
How Google Works
A Ranking Engineer’s Perspective
Paul Haahr
SMX West
March 3, 2016
![Page 2: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/2.jpg)
Google Search Today
![Page 3: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/3.jpg)
Mobile First
![Page 4: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/4.jpg)
Features
• spelling suggestions
• autocomplete
• related searches
• related questions
• calculator
• knowledge graph
• answers
• featured snippets
• maps
• images
• videos
• in-depth articles
• movie showtimes
• sports scores
• weather
• flight status
• package tracking
• …
![Page 5: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/5.jpg)
Ranking
![Page 6: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/6.jpg)
10 Blue Links
![Page 7: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/7.jpg)
What documents do we show?
What order do we show them in?
![Page 8: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/8.jpg)
Life of a Query
![Page 9: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/9.jpg)
Two Parts of a Search Engine
• Ahead of time (before the query)
• Query processing
![Page 10: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/10.jpg)
Before the Query
• Crawl the web
• Analyze the crawled pages
  • Extract links
  • Render contents
  • Annotate semantics
  • …
• Build an index
![Page 11: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/11.jpg)
The Index
• Like the index of a book
• For each word, a list of pages it appears on
• Broken up into groups of millions of pages
  • At Google, these are called “shards”
  • 1000s of shards for the web index
• Plus per-document metadata
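
The index described on this slide can be sketched as a toy inverted index. This is a minimal illustration, not Google's implementation: the function name `build_shards`, the `pages_per_shard` parameter, the example URLs, and the `length` metadata field are all assumptions made for the example.

```python
from collections import defaultdict

def build_shards(pages, pages_per_shard):
    """Split pages into shards; each shard gets its own inverted index and metadata."""
    shards = []
    items = sorted(pages.items())
    for start in range(0, len(items), pages_per_shard):
        postings = defaultdict(set)  # word -> set of page ids it appears on
        metadata = {}                # page id -> per-document metadata (hypothetical field)
        for page_id, text in items[start:start + pages_per_shard]:
            words = text.lower().split()
            for word in words:
                postings[word].add(page_id)
            metadata[page_id] = {"length": len(words)}
        shards.append({"postings": dict(postings), "metadata": metadata})
    return shards

# Three made-up pages, two pages per shard (real shards hold millions of pages).
pages = {
    "a.example/1": "gm trucks for sale",
    "b.example/2": "gm corn field trials",
    "c.example/3": "used trucks and cars",
}
shards = build_shards(pages, pages_per_shard=2)
print(shards[0]["postings"]["gm"])
```

Each shard can answer "which of my pages contain this word?" on its own, which is what lets a query be sent to every shard independently.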
![Page 12: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/12.jpg)
Query Processing
• Query understanding and expansion
• Retrieval and scoring
• Post-retrieval adjustments
![Page 13: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/13.jpg)
Query Understanding
• Does the query name any known entities?
  • [san jose convention center]
  • [matt cutts]
• Are there useful synonyms?
  • [gm trucks]: “gm” → “general motors”
  • [gm corn]: “gm” → “genetically modified”
• Context matters
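
The "gm" example can be sketched as a context-dependent lookup. This is only an illustration of the idea that the other words in the query select the synonym. The table `CONTEXT_SYNONYMS` and the function `expand_query` are hypothetical, and a real system would not use a hand-written table like this.

```python
# Hypothetical table: (term, context word elsewhere in the query) -> expansion.
CONTEXT_SYNONYMS = {
    ("gm", "trucks"): "general motors",
    ("gm", "corn"): "genetically modified",
}

def expand_query(query):
    """Return the query terms plus any expansions licensed by another query word."""
    terms = query.lower().split()
    expansions = []
    for term in terms:
        for other in terms:
            synonym = CONTEXT_SYNONYMS.get((term, other))
            if synonym is not None:
                expansions.append(synonym)
    return terms, expansions

print(expand_query("gm trucks"))
print(expand_query("gm corn"))
```

With no disambiguating context (the bare query `gm`), this sketch adds no expansion at all.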
![Page 14: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/14.jpg)
Retrieval and Scoring
• Send the query to all the shards
• Each shard
  • Finds matching pages
  • Computes a score for query+page
  • Sends back the top N pages by score
• Combine all the top pages
• Sort by score
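
The steps on this slide can be sketched as a scatter-gather loop. The scoring function here (count of query terms present on the page) is a placeholder assumption, and the shard contents and names such as `search_shard` are made up for the example. Only the overall shape (per-shard top N, then combine and sort) comes from the slide.

```python
import heapq

def score(query_terms, page_words):
    """Toy score (an assumption): number of query terms that occur on the page."""
    return sum(1 for term in query_terms if term in page_words)

def search_shard(shard, query_terms, n):
    """Find matching pages in one shard and return its top n as (score, page_id)."""
    scored = []
    for page_id, text in shard.items():
        s = score(query_terms, set(text.lower().split()))
        if s > 0:  # only pages that match at least one term
            scored.append((s, page_id))
    return heapq.nlargest(n, scored)

def search(shards, query, n):
    """Send the query to every shard, combine the per-shard top n, sort by score."""
    query_terms = query.lower().split()
    combined = []
    for shard in shards:
        combined.extend(search_shard(shard, query_terms, n))
    return sorted(combined, reverse=True)

# Two tiny hypothetical shards, each mapping page id -> page text.
shards = [
    {"p1": "gm trucks for sale", "p2": "corn recipes"},
    {"p3": "used trucks", "p4": "gm trucks review"},
]
results = search(shards, "gm trucks", n=2)
print(results)
```

Because each shard returns at most N candidates, the combining step sorts a number of pages proportional to the shard count rather than to the size of the index.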
![Page 15: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/15.jpg)
Post-retrieval adjustments
• Host clustering, sitelinks
• Is there too much duplication?
• Spam demotions, manual actions
• …
![Page 16: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/16.jpg)
What do ranking engineers do? (version 1)
Write code for those servers
![Page 17: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/17.jpg)
Scoring Signals
![Page 18: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/18.jpg)
Signal
• A piece of information used in scoring
• Query independent – feature of page
  • PageRank, language, mobile friendliness, ...
• Query dependent – feature of page & query
  • keyword hits, synonyms, proximity, …
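
The distinction between the two kinds of signal can be sketched as follows. The talk does not say how signals are combined, so the weighted sum, the `WEIGHTS` values, and the signal names (`link_score` as a stand-in for a link-based signal, `exact_phrase` as a crude proximity signal) are all assumptions for illustration.

```python
# Hypothetical weights; the real combination of signals is not described in the talk.
WEIGHTS = {
    "link_score": 2.0,       # query independent (stand-in for a link-based signal)
    "mobile_friendly": 0.5,  # query independent
    "keyword_hits": 1.0,     # query dependent
    "exact_phrase": 1.5,     # query dependent (crude proximity signal)
}

def query_dependent_signals(query, page_text):
    """Signals that need both the query and the page; computed at query time."""
    terms = query.lower().split()
    words = page_text.lower().split()
    hits = sum(1 for w in words if w in terms)
    phrase = any(words[i:i + len(terms)] == terms for i in range(len(words)))
    return {"keyword_hits": hits, "exact_phrase": 1.0 if phrase else 0.0}

def combined_score(query, page):
    """Weighted sum over precomputed page signals and query-time signals."""
    signals = dict(page["static_signals"])  # computed ahead of time, before the query
    signals.update(query_dependent_signals(query, page["text"]))
    return sum(WEIGHTS[name] * value for name, value in signals.items())

page = {
    "text": "gm trucks for sale",
    "static_signals": {"link_score": 0.5, "mobile_friendly": 1.0},
}
print(combined_score("gm trucks", page))
print(combined_score("trucks gm", page))
```

The query-independent signals can be stored with the per-document metadata, while the query-dependent ones have to be computed for each query and page pair.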
![Page 19: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/19.jpg)
What do ranking engineers do? (version 2)
Look for new signals.
Combine old signals in new ways.
![Page 20: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/20.jpg)
Metrics
![Page 21: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/21.jpg)
“If you can not measure it, you can not improve it.”
–Lord Kelvin (sort of)
![Page 22: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/22.jpg)
Key Metrics
• Relevance
  • Does a page usefully answer the user’s query?
  • Ranking’s top-line metric
• Quality
  • How good are the results we show?
• Time to result (faster is better)
• ...
![Page 23: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/23.jpg)
Higher results matter
• “Position weighted”
• “Reciprocally ranked” metrics
  • Position 1 is worth 1
  • Position 2 is worth ½
  • Position 3 is worth ⅓
  • Position 4 is worth ¼
  • …
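
The reciprocal weights listed above can be turned into a small worked example. The weights (1, ½, ⅓, ¼, …) come from the slide. Normalizing by the total weight, and the 0/1 ratings used as input, are assumptions made so the example produces a single comparable number.

```python
from fractions import Fraction

def position_weight(position):
    """Position 1 is worth 1, position 2 is worth 1/2, position 3 is worth 1/3, ..."""
    return Fraction(1, position)

def position_weighted_score(ratings):
    """Weighted average of per-result ratings, listed from top result to bottom.

    Dividing by the sum of the weights is an assumption, not part of the slide.
    """
    weights = [position_weight(i) for i in range(1, len(ratings) + 1)]
    return sum(w * r for w, r in zip(weights, ratings)) / sum(weights)

# The same single good result (rating 1), placed first versus placed fourth.
good_on_top = position_weighted_score([1, 0, 0, 0])
good_at_bottom = position_weighted_score([0, 0, 0, 1])
print(good_on_top, good_at_bottom)
```

With four results the weights sum to 25/12, so the good result contributes 12/25 when ranked first and 3/25 when ranked fourth: moving it up by three positions quadruples its contribution.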
![Page 24: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/24.jpg)
What do ranking engineers do? (version 3)
Optimize for our metrics
![Page 25: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/25.jpg)
But where do the metrics come from?
![Page 26: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/26.jpg)
Evaluation
![Page 27: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/27.jpg)
How do we measure ourselves?
• Live Experiments
• Human Rater Experiments
![Page 28: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/28.jpg)
Live Experiments
![Page 29: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/29.jpg)
Live Experiments
• A/B experiments on real traffic
  • Similar to what many other websites do
• Look for changes in click patterns
  • Harder to understand than you might expect
• A lot of traffic is in one experiment or another
![Page 30: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/30.jpg)
Interpreting Live Experiments
• Both pages P1 and P2 answer user’s need
• For P1, answer is on the page
• For P2, answer is on the page and in the snippet
• Algorithm A puts P1 before P2 ⇒ user clicks on P1 ⇒ “good”
• Algorithm B puts P2 before P1 ⇒ no click ⇒ “bad”
• Do we really think A is better than B?
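
The P1/P2 scenario can be simulated in a few lines to show why a naive click metric misleads. The user model here (the user reads only the top result and clicks only if the snippet does not already answer the question) and the function names are simplifying assumptions, not a description of how Google interprets clicks.

```python
def simulate_click(ranking, answer_in_snippet):
    """Return the page the user clicks, or None if the top snippet already answers."""
    top = ranking[0]
    if answer_in_snippet[top]:
        return None  # need met on the results page itself, so no click
    return top

def naive_click_metric(click):
    """Naive reading of the logs: a click counts as good, no click counts as bad."""
    return 1 if click is not None else 0

# P2 shows the answer in its snippet; P1 only has it on the page.
answer_in_snippet = {"P1": False, "P2": True}

click_a = simulate_click(["P1", "P2"], answer_in_snippet)  # algorithm A
click_b = simulate_click(["P2", "P1"], answer_in_snippet)  # algorithm B
print(naive_click_metric(click_a), naive_click_metric(click_b))
```

Under this model the user is satisfied in both cases, yet the naive metric scores A above B, which is the trap the slide's closing question points at.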
![Page 31: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/31.jpg)
Human Rater Experiments
![Page 32: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/32.jpg)
Human Rater Experiments
• Show real people experimental search results
• Ask how good the results are
• Ratings aggregated across raters
• Published guidelines explain criteria for raters
• Tools support doing this in an automated way
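
"Ratings aggregated across raters" can be sketched as a per-result average. The numeric 0 to 4 scale, the use of the arithmetic mean, and the names below are assumptions for illustration. The talk says only that ratings are aggregated and, later, refers to a "rater average".

```python
from statistics import mean

def aggregate(ratings_by_result):
    """Average the ratings that several raters gave to each result."""
    return {result: mean(ratings) for result, ratings in ratings_by_result.items()}

# Hypothetical ratings from three raters on an assumed 0 (worst) to 4 (best) scale.
ratings_by_result = {
    "result_1": [4, 3, 2],
    "result_2": [1, 2, 0],
}
averages = aggregate(ratings_by_result)
print(averages)
```

Averaging smooths out individual disagreement, but it cannot correct for raters who all misread the query in the same way.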
![Page 33: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/33.jpg)
![Page 34: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/34.jpg)
Result Rating Task
![Page 35: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/35.jpg)
Two Scales
• Needs Met
  • Does this page address the user’s need?
  • Our current relevance metric
• Page Quality
  • How good is the page?
![Page 36: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/36.jpg)
Mobile First
![Page 37: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/37.jpg)
Mobile First Rating
“Needs Met rating tasks ask [raters] to focus on mobile user needs and think
about how helpful and satisfying the result is for the mobile users.”
![Page 38: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/38.jpg)
How do we make it mobile-centric?
• More mobile queries than desktop in samples
• Pay attention to user’s location
• Tools display mobile user experience
• Raters visit websites on smartphones
![Page 39: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/39.jpg)
Needs Met Rating
![Page 40: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/40.jpg)
Needs Met Rating
• Fully Meets
• Highly Meets
• Moderately Meets
• Slightly Meets
• Fails to Meet
(Following examples are from Rater Guidelines)
![Page 41: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/41.jpg)
Fully Meets
![Page 42: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/42.jpg)
(Very) Highly Meets
![Page 43: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/43.jpg)
Highly Meets
![Page 44: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/44.jpg)
(More) Highly Meets
![Page 45: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/45.jpg)
Moderately Meets
![Page 46: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/46.jpg)
Slightly Meets
![Page 47: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/47.jpg)
Fails to Meet
![Page 48: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/48.jpg)
Page Quality Rating
![Page 49: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/49.jpg)
Page Quality Concepts
• Expertise
• Authoritativeness
• Trustworthiness
![Page 50: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/50.jpg)
High Quality Pages
• A satisfying amount of high quality main content
• The page and website are expert, authoritative, and trustworthy for the topic of the page
• The website has a good reputation for the topic of the page
![Page 51: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/51.jpg)
Low Quality Pages
• The quality of the main content is low
• There is an unsatisfying amount of main content
• The author does not have expertise or is not trustworthy or authoritative for the topic
• The website has a negative reputation
• The secondary content is distracting or unhelpful
![Page 52: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/52.jpg)
Optimizing Our Metrics
![Page 53: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/53.jpg)
Ranking engineers
• Team of a few hundred computer scientists
• Focused on our metrics and signals
• Run lots of experiments
• Make lots of changes
![Page 54: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/54.jpg)
Development Process
• Idea
• Repeat until ready:
  • Write code
  • Generate data
  • Run experiments
  • Analyze
• Launch report by Quantitative Analyst
• Launch review
![Page 55: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/55.jpg)
What do ranking engineers do? (version 4)
Move results with good ratings up.
Move results with bad ratings down.
![Page 56: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/56.jpg)
What Goes Wrong?
(And how do we fix it?)
![Page 57: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/57.jpg)
Two kinds of problems
• Systematically bad ratings
• Metrics don’t capture things we care about
![Page 58: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/58.jpg)
Bad Ratings
![Page 59: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/59.jpg)
[texas farm fertilizer]
• User is looking for a brand of fertilizer
• Unlikely to want to go to the manufacturer’s headquarters
• Rater average called map of headquarters almost “Highly Meets”
![Page 60: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/60.jpg)
Patterns of Losses
• Look for things we think are bad in results
  • Either live or from experiments
• Create examples for rater guidelines
![Page 61: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/61.jpg)
New rater example
![Page 62: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/62.jpg)
Missing Metrics
![Page 63: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/63.jpg)
![Page 64: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/64.jpg)
Low Quality Content in 2009-2011
• Lots of complaints about low quality content
• But our relevance metric kept going up
• Low quality pages can be very relevant
• We thought we were doing great
• ⇒ We weren’t measuring what we needed to
![Page 65: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/65.jpg)
Quality Metric
• Gets directly at the quality issue
• Not the same as relevance
• Enabled development of quality-related signals
![Page 66: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/66.jpg)
When the Metrics Miss Something
![Page 67: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/67.jpg)
What do ranking engineers do? (version 5)
Fix rater guidelines or develop new metrics
(when necessary)
![Page 68: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/68.jpg)
Thank you!
![Page 69: How Google Works: A Ranking Engineer's Perspective By Paul Haahr](https://reader035.vdocuments.us/reader035/viewer/2022062218/586fda8d1a28ab18428b5dad/html5/thumbnails/69.jpg)
Questions?