Google RankBrain: What It Does, How It Works, What It Means, by Marcus Tober
TRANSCRIPT
#SMX #22A @MarcusTober
Google RankBrain
- WHAT IT DOES
- HOW IT WORKS
- WHAT IT MEANS
MARCUS TOBER
SMX West March 2, 2016
Searchmetrics Made with love in Berlin
More than 220 passionate people
Innovator in SEO Software since 2005
Marcus Tober, Founder and CTO of Searchmetrics
In love with SEO and search since 2001
Studied computer science in Berlin, so I'm the techie!
Machine Learning or AI?
Machine Learning (ML) ≠ AI
Spectrum of Intelligence
• Machine Learning (ML): an algorithm that improves over time
• Deep Learning: aims to bridge the gap between ML and AI; solves more complex problems
• Artificial Intelligence: human-like intelligence
Common Applications of Machine Learning
• Facebook photo recognition
• Email spam filters
• Database mining
• Music or movie recommendations (iTunes / Spotify / Netflix)
• Solving games (e.g. chess): IBM's Deep Blue vs Garry Kasparov
Go: The limits of machine learning
• More than 2,500 years old
• Played by 40m people worldwide
• Deemed uncrackable by machine learning/AI (until AlphaGo)
Is Go really that hard? (Well, actually, yes…)
Number of…
• …possible moves in a chess game: 10^50
• …atoms in the observable universe: 4x10^79 to 4x10^81
• …possible moves in a game of Go: 10^171
Not so simple
That’s…
1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
…possible positions!
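To put the quoted figures side by side, here is a minimal sketch using Python's arbitrary-precision integers. It uses only the estimates from the slides (10^50 for chess, the upper estimate of 4x10^81 atoms, 10^171 for Go); the variable names are illustrative.

```python
# Figures as quoted on the slides (estimates, not exact counts).
CHESS_MOVES = 10**50
ATOMS_UPPER = 4 * 10**81   # upper estimate for atoms in the observable universe
GO_POSITIONS = 10**171

# How many times larger is the Go figure than the upper atom estimate?
go_vs_atoms = GO_POSITIONS // ATOMS_UPPER
# ...and than the chess figure?
go_vs_chess = GO_POSITIONS // CHESS_MOVES

# Scientific notation keeps the output readable.
print(f"Go vs atoms: {go_vs_atoms:.1e}")
print(f"Go vs chess: {go_vs_chess:.1e}")
```

Even against the largest atom estimate, the Go figure is bigger by dozens of orders of magnitude, which is why enumerating every position is not an option.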
Deep learning: AlphaGo
• Traditional AI methods, which analyze all possible positions, failed
• AlphaGo uses deep neural networks with 12 different network layers
• One neural network selects the next move to play. The other neural network, the "value network", predicts the winner of the game.
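The division of labour between a move-selecting function and a position-evaluating function can be sketched in a few lines. This is a toy illustration only: the two functions below are simple stand-ins, not neural networks, and the way they are combined (`choose_move`, `top_k`) is an assumption made for the example, not AlphaGo's actual search procedure. All numbers are made up.

```python
import math

def policy(scores):
    """Stand-in for the move-selecting network: softmax over raw move scores."""
    m = max(scores.values())
    exps = {move: math.exp(s - m) for move, s in scores.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}

def value(position_strength):
    """Stand-in for the value network: map a strength estimate to a win probability."""
    return 1.0 / (1.0 + math.exp(-position_strength))

def choose_move(move_scores, strength_after_move, top_k=2):
    """Keep the top_k moves the policy prefers, then pick the one whose
    resulting position the value function rates highest."""
    probs = policy(move_scores)
    candidates = sorted(probs, key=probs.get, reverse=True)[:top_k]
    return max(candidates, key=lambda mv: value(strength_after_move[mv]))

# Hypothetical numbers for three candidate moves.
move_scores = {"A": 2.0, "B": 1.5, "C": -1.0}
strength_after_move = {"A": 0.1, "B": 0.9, "C": 3.0}
best = choose_move(move_scores, strength_after_move)
print(best)
```

The point of the split is that the first function narrows the options, so the second only has to evaluate a handful of positions instead of all of them.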
Understanding RankBrain
Geoffrey Hinton
Legend in neural network research. World-renowned AI researcher.
Invented some of the core algorithms of deep learning back in the 80s.
Working with Google since 2011.
Novel concept: thought vectors.
Capturing thoughts…
IMAGINE EMPTY SPACE
Visualizing RankBrain
Every word gets a position in space:
Oregon, Salem, Sacramento, California
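As a rough illustration of words having positions in space, the sketch below places the four words at hand-picked 2-D coordinates. The coordinates are hypothetical (real word vectors are learned and have far more dimensions); they are chosen so that the step from a state to its capital is the same in both cases, a property often reported for word embeddings.

```python
import math

# Hypothetical 2-D positions, hand-picked for illustration.
words = {
    "Oregon":     (1.0, 1.0),
    "Salem":      (1.2, 2.0),
    "California": (3.0, 1.0),
    "Sacramento": (3.2, 2.0),
}

def distance(a, b):
    """Euclidean distance between two words' positions."""
    return math.dist(words[a], words[b])

def offset(a, b):
    """Vector pointing from word a to word b (rounded to hide float noise)."""
    (ax, ay), (bx, by) = words[a], words[b]
    return (round(bx - ax, 6), round(by - ay, 6))

print(distance("Oregon", "Salem"), distance("Oregon", "Sacramento"))
print(offset("Oregon", "Salem"), offset("California", "Sacramento"))
```

With these made-up positions, Salem sits closer to Oregon than Sacramento does, and both state-to-capital offsets point in the same direction.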
Visualizing RankBrain
Query sentences get a position in space:
• "What's the weather going to be like in California?"
• "Weather Forecast California"
Using training data, similar query sentences (with similar results) are positioned close together.
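One simple way to give a whole query a position is to average the vectors of its words and compare queries by cosine similarity. This is a common baseline used here purely for illustration; the talk does not say RankBrain works this way, and the word vectors below are invented.

```python
import math

# Hypothetical 3-D word vectors, hand-picked for illustration.
WORD_VECTORS = {
    "weather":    (0.9, 0.1, 0.0),
    "forecast":   (0.8, 0.2, 0.1),
    "california": (0.1, 0.9, 0.0),
    "cash":       (0.0, 0.1, 0.9),
    "advance":    (0.1, 0.0, 0.8),
}

def embed(query):
    """Average the vectors of the known words; unknown words are skipped."""
    tokens = [t.strip("?.,!").lower() for t in query.split()]
    vecs = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vecs:
        raise ValueError("no known words in query")
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

q1 = embed("What's the weather going to be like in California?")
q2 = embed("Weather Forecast California")
q3 = embed("Cash advance")

print(cosine(q1, q2))  # the two weather queries
print(cosine(q1, q3))  # weather query vs. an unrelated loan query
```

The two weather queries end up much closer to each other than either is to the loan query, even though their wording differs.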
Visualizing RankBrain
Queries (Q) and results (R) get a position in space.
Result scoring: good results rank better based on their proximity to the query.
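Scoring by proximity can be sketched as sorting results by their distance to the query. The positions and page names below are hypothetical, and Euclidean distance is just one possible choice of proximity measure.

```python
import math

def rank_by_proximity(query_vec, results):
    """Return result names ordered from nearest to farthest from the query."""
    return sorted(results, key=lambda name: math.dist(query_vec, results[name]))

# Hypothetical 2-D positions for one query (Q) and three results (R).
query = (0.5, 0.5)
results = {
    "page_a": (0.9, 0.1),
    "page_b": (0.55, 0.45),
    "page_c": (0.1, 0.2),
}

ranking = rank_by_proximity(query, results)
print(ranking)
```

Here `page_b` sits almost on top of the query and therefore comes first, regardless of any other property of the page.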
Searchmetrics Hypothesis
Traditional ranking factors can no longer make sense of organic rankings.
Searchmetrics Hypothesis
We said that:
• RankBrain concentrates on relevant content
• RankBrain uses thought vectors to map relevant
results to queries
• This relevance score is then used to help order rankings
Brief study background
• Top 30 results for Google US
• ~400,000 data points
• 3 keyword sets: Loan, E-Commerce, Health
• Approach: discover which ranking factors are most important
• Emulate RankBrain by adding scores to understand content relevance
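One common way to ask which factors matter is to correlate each factor with ranking position. The sketch below computes a Spearman rank correlation in plain Python. It is an assumption that the study used a measure of this kind; the data is made up and the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up data for five ranking positions (1 = best).
positions = [1, 2, 3, 4, 5]
word_count = [3200, 2900, 2500, 2100, 1800]
backlinks = [120, 40000, 300, 9000, 45]

# Position 1 is the best rank, so a coefficient near -1 means
# "higher factor value goes with better positions".
print(spearman(positions, word_count))
print(spearman(positions, backlinks))
```

In this toy data, word count lines up perfectly with position while backlinks barely do, which is the kind of contrast the study is looking for across its keyword sets.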
Examples: Backlinks & Internal Links
Keyword: "Cash advance fresno ca" (LOAN)
• Rank 5: 17 backlinks, 21 internal links
• Rank 12: 52,000 backlinks, 443 internal links
[Chart: Backlinks (0 to 350,000) by ranking position (1 to 30) for E-Commerce, Health and Loan; annotations: positive correlation, negative correlation]
[Chart: Number of Internal Links (0 to 300) by ranking position (1 to 30) for E-Commerce, Health and Loan]
Keyword in Title
Keyword: "fast cash credit card" (LOAN)
• Rank 6: 5,685 backlinks, no keyword in title
• Rank 17: 59,780 backlinks, keyword in title
[Chart: Keyword in Title (0% to 50% of pages) by ranking position (1 to 30) for E-Commerce, Health and Loan]
Only 10% of pages in "loans" have the keyword in the title tag
Word Count & Interactive Elements
Keyword: "natural detox" (HEALTH)
• Rank 5: word count = 3180, internal links count = 375, interactive elements = 398
• Rank 26: word count = 6087, internal links count = 395, interactive elements = 625
[Chart: Word Count (1,000 to 3,500) by ranking position (1 to 30) for E-Commerce, Health and Loan]
[Chart: Number of Interactive Elements (0 to 400) by ranking position (1 to 30) for E-Commerce, Health and Loan]
Why do “traditional” ranking factors fail to explain these examples?
Relevance Ranking Factors
• We emulated RankBrain and gave search results a relevance score
• This score is based on how relevant a result is to a query
• We used around 25 relevance ranking factors to assess relevance
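Combining several factor scores into one relevance score can be sketched as a weighted average. The talk does not list the roughly 25 factors or how they were combined, so the four factor names, the weights and the page scores below are all hypothetical.

```python
# Hypothetical factor names and weights (weights sum to 1.0).
WEIGHTS = {
    "intent_match": 0.4,
    "structure": 0.2,
    "user_experience": 0.2,
    "topic_coverage": 0.2,
}

def relevance_score(factors):
    """Weighted average of per-factor scores, each expected in [0, 1]."""
    missing = set(WEIGHTS) - set(factors)
    if missing:
        raise KeyError(f"missing factor scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)

# Made-up per-factor scores for two pages competing for the same query.
pages = {
    "page_x": {"intent_match": 0.9, "structure": 0.8,
               "user_experience": 0.7, "topic_coverage": 0.9},
    "page_y": {"intent_match": 0.4, "structure": 0.9,
               "user_experience": 0.9, "topic_coverage": 0.3},
}

ranked = sorted(pages, key=lambda p: relevance_score(pages[p]), reverse=True)
print(ranked)
```

A single number per result makes it possible to compare relevance with traditional factors such as backlinks on the same set of rankings.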
Introducing Relevance
Keyword: "security camera system" (E-COMM)
• 9 out of the top 10 e-commerce websites for this keyword have an add-to-cart function above the fold
• Rank 9 does not
• However, it has the highest relevance score of the top 30, and that's why it ranks
Introducing Relevance
Keyword: "best bluetooth headphones" (E-COMM)
• Rank 2: internal links count = 170, relevance = highest in top 30
• Rank 26: internal links count = 412, relevance = low
Introducing Relevance
Keyword: "natural detox" (HEALTH)
• Rank 5: word count = 3180, internal links count = 375, interactive elements = 398
• Rank 26: word count = 6087, internal links count = 395, interactive elements = 625
Being Relevant
Content with a high relevance score:
• Matches user intention
• Is logically structured and comprehensive
• Offers a good user experience
• Deals with topics holistically
Key Findings
• Top ranking factors differ depending on the keyword set
• Relevance ranking factors dominate results across all keyword sets
• All previous examples can be explained by a higher relevance score
• This score overpowered other ranking factors, which is why these pages ranked highly
[Chart: Top 10 Ranking Factors by Category; legend: Relevance Ranking Factor, Traditional Ranking Factor]
Outlook: this is where we are going
Outlook for SEO
• SEO is as important as ever, but it's changing
• RankBrain is not used on all queries: for example, short/popular queries with well-known results are not filtered by RankBrain
• Relevance is crucial for good rankings: RankBrain can detect how relevant your content is
• Make sure your content matches user/query intent
The Future of Search: An abundance of redundance
Machine learning and Searchmetrics share a philosophy:
• Incremental improvements through powerful data insights
• Using machine and deep learning to make sense of complex data
• A data-driven approach is the only way to make sense of the abundance of redundance online
• This applies to content, too
Thank you for your attention!
SEE YOU AT THE NEXT #SMX!
THANK YOU!