How to Spot a Bear - An Intro to Machine Learning for SEO
TRANSCRIPT
![Page 1: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/1.jpg)
@TomAnthonySEO
April 2015 - BrightonSEO
HOW TO SPOT A BEAR: A Machine Learning Introduction for SEOs
![Page 2: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/2.jpg)
![Page 3: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/3.jpg)
Can you define a list of rules for spotting bears?
![Page 4: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/4.jpg)
Let’s start with:
1) Four legs.
![Page 5: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/5.jpg)
Bear!
![Page 6: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/6.jpg)
List of rules (first half): (when I asked in the office)
1. Four legs. 2. Breathes. 3. Furry. 4. Long snout.
![Page 7: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/7.jpg)
Bear!
![Page 8: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/8.jpg)
List of rules:
1. Four legs. 2. Breathes. 3. Furry. 4. Long snout.
5. Brown. 6. Not always brown. 7. Mammal. 8. No tail.
(how do you spot a mammal?!)
![Page 9: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/9.jpg)
Let’s check our rules…
![Page 10: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/10.jpg)
Rules say:
Bear
![Page 11: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/11.jpg)
Rules say:
Harmless Furry Thing (fewer than 4 legs)
![Page 12: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/12.jpg)
Rules say:
Odd Grey Creature (no long snout)
![Page 13: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/13.jpg)
Remove ‘long snout’, and rules say:
Bear (Extra-terrestrial bear?!)
![Page 14: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/14.jpg)
Our rules suck.
![Page 15: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/15.jpg)
A different bear: Google’s Panda
![Page 16: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/16.jpg)
Can you define a list of rules for spotting spammy pages?
Same problem as bears!
![Page 17: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/17.jpg)
NBED GOOD PAGE
Good page
![Page 18: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/18.jpg)
NBED GOOD PAGE
Commercial page, still good.
![Page 19: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/19.jpg)
Hrm…
![Page 20: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/20.jpg)
Seems legit…
![Page 21: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/21.jpg)
WTF!
![Page 22: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/22.jpg)
Google can’t write rules.
![Page 23: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/23.jpg)
What we can do is identify spammy or non-spammy attributes.
![Page 24: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/24.jpg)
Some Possible Spam Signals
Are there adverts on the page?
Are there lots of spelling mistakes?
Is there little text content?
Are there Calls To Action in ALL CAPS?
![Page 25: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/25.jpg)
Smooth segue to:
Machine Learning
![Page 26: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/26.jpg)
List of pages we’ve manually classified.
List of attributes that we believe are important to classifying pages.
![Page 27: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/27.jpg)
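Example Data

| Site | Adverts on page? | More than 5 spelling mistakes? | Less than 200 words of content? | CTA in ALL CAPS? | Classification |
|---|---|---|---|---|---|
| site A | Y | Y | Y | Y | Spam Site |
| site B | N | N | Y | Y | Good Site |
| site C | Y | N | N | N | Spam Site |
| site D | N | Y | N | Y | Spam Site |
| site E | N | Y | N | N | Good Site |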
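
To hand data like this to a learning algorithm, each Y/N answer is usually turned into a 1 or a 0. A minimal Python sketch of that step, using the thresholds from the column headings; the function name `to_features` and its parameters are hypothetical and do not come from the talk:

```python
def to_features(has_adverts, spelling_mistakes, word_count, cta_all_caps):
    """Turn raw page measurements into the four Y/N attributes as 1/0."""
    return [
        int(has_adverts),            # adverts on page?
        int(spelling_mistakes > 5),  # more than 5 spelling mistakes?
        int(word_count < 200),       # less than 200 words of content?
        int(cta_all_caps),           # CTA in ALL CAPS?
    ]

# The five manually classified sites from the example data (1 = Y, 0 = N).
EXAMPLE_DATA = {
    "site A": ([1, 1, 1, 1], "Spam Site"),
    "site B": ([0, 0, 1, 1], "Good Site"),
    "site C": ([1, 0, 0, 0], "Spam Site"),
    "site D": ([0, 1, 0, 1], "Spam Site"),
    "site E": ([0, 1, 0, 0], "Good Site"),
}

# A made-up page: has adverts, 8 spelling mistakes, 150 words, shouty CTA.
print(to_features(True, 8, 150, True))
```

The made-up page ends up with the same four attributes as site A.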
![Page 28: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/28.jpg)
Neural Networks: A Perceptron
Inputs → Neuron → Output
![Page 29: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/29.jpg)
Neural Networks: A Perceptron
Inputs: 1, 0, 1, 0
Weights: 0.5, 0.5, 0.5, 0.5
Neuron: if inputs >= 1, output TRUE
![Page 30: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/30.jpg)
1 x 0.5 = 0.5
0 x 0.5 = 0
1 x 0.5 = 0.5
0 x 0.5 = 0
______
Total: 1
if: inputs >= 1, output TRUE
Output: TRUE
![Page 31: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/31.jpg)
1 x 0.5 = 0.5
0 x 0.5 = 0
0 x 0.5 = 0
0 x 0.5 = 0
______
Total: 0.5
if: inputs >= 1, output TRUE
Output: FALSE
![Page 32: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/32.jpg)
1 x 0.5 = 0.5
0 x 0.5 = 0
1 x 0.4 = 0.4
0 x 0.5 = 0
______
Total: 0.9
if: inputs >= 1, output TRUE
Output: FALSE
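
The arithmetic on the last three slides can be sketched in a few lines of Python: multiply each input by its weight, add the results up, and output TRUE when the total reaches 1. The names `perceptron` and `THRESHOLD` are illustrative, not from the talk:

```python
THRESHOLD = 1  # the neuron outputs TRUE when the weighted total reaches this

def perceptron(inputs, weights):
    """Return (total, output): sum of input x weight, then the >= 1 check."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total, total >= THRESHOLD

# The three worked examples: (inputs, weights)
examples = [
    ([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]),
    ([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]),
    ([1, 0, 1, 0], [0.5, 0.5, 0.4, 0.5]),
]
for inputs, weights in examples:
    total, output = perceptron(inputs, weights)
    print(f"Total: {total:.1f}  Output: {output}")
```

It gives the same totals as the slides (1, 0.5 and 0.9), with only the first one reaching the threshold.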
![Page 33: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/33.jpg)
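Example Data

| Site | Adverts on page? | More than 5 spelling mistakes? | Less than 200 words of content? | CTA in ALL CAPS? | Classification |
|---|---|---|---|---|---|
| site A | Y | Y | Y | Y | Spam Site |
| site B | N | N | Y | Y | Good Site |
| site C | Y | N | N | N | Spam Site |
| site D | N | Y | N | Y | Spam Site |
| site E | N | Y | N | N | Good Site |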
![Page 34: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/34.jpg)
Untrained Neuron
Is site spam?
adverts: weight 0.5
>5 spelling mistakes: weight 0.5
< 200 words content: weight 0.5
CTA in ALL CAPS: weight 0.5
if: inputs >= 1, output TRUE
![Page 35: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/35.jpg)
Training
adverts: input 0, weight 0.5
>5 spelling mistakes: input 0, weight 0.5
< 200 words content: input 1, weight 0.5
CTA in ALL CAPS: input 1, weight 0.5
if: inputs >= 1, output TRUE
SPAM!
![Page 36: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/36.jpg)
Training
adverts: weight 0.5
>5 spelling mistakes: weight 0.5
< 200 words content: weight 0.6
CTA in ALL CAPS: weight 0.6
if: inputs >= 1, output TRUE
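
The slides do not spell out how the weights get adjusted. One common choice is the classic perceptron learning rule: whenever the neuron gets a site wrong, nudge the weights of the inputs that were switched on, upwards if it missed spam and downwards if it flagged a good site. The sketch below assumes that rule, a step size of 0.1 and the five example sites; it is an illustration, not the speaker's procedure, and it is not expected to land on the same weights as the slides:

```python
THRESHOLD = 1
STEP = 0.1  # assumed size of each weight adjustment

# (adverts, >5 spelling mistakes, <200 words, CTA in ALL CAPS) -> 1 if spam
SITES = [
    ([1, 1, 1, 1], 1),  # site A, spam
    ([0, 0, 1, 1], 0),  # site B, good
    ([1, 0, 0, 0], 1),  # site C, spam
    ([0, 1, 0, 1], 1),  # site D, spam
    ([0, 1, 0, 0], 0),  # site E, good
]

def predict(inputs, weights):
    total = sum(x * w for x, w in zip(inputs, weights))
    # Rounding guards against floating-point noise right at the threshold.
    return int(round(total, 9) >= THRESHOLD)

def train(sites, weights, max_passes=100):
    weights = list(weights)
    for _ in range(max_passes):
        mistakes = 0
        for inputs, label in sites:
            error = label - predict(inputs, weights)  # +1, 0 or -1
            if error:
                mistakes += 1
                # Only weights whose input was 1 move; rounding to one
                # decimal keeps them tidy because STEP is 0.1.
                weights = [round(w + STEP * error * x, 1)
                           for w, x in zip(weights, inputs)]
        if mistakes == 0:  # a full pass with no errors: stop
            break
    return weights

trained = train(SITES, [0.5, 0.5, 0.5, 0.5])
print(trained)
print(all(predict(x, trained) == y for x, y in SITES))
```

Under these assumptions, training stops once a full pass over the five sites produces no mistakes.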
![Page 37: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/37.jpg)
After training: 4/5 sites correct
Is site spam?
adverts: weight 0.2
>5 spelling mistakes: weight 0.7
< 200 words content: weight 0.4
CTA in ALL CAPS: weight 0.5
if: inputs >= 1, output TRUE
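
The 4/5 figure can be checked by running the five example sites through the trained neuron. A short Python sketch, assuming the four weights are listed in the same order as the inputs (adverts, spelling mistakes, word count, ALL CAPS); the names `is_spam`, `WEIGHTS` and `SITES` are illustrative:

```python
THRESHOLD = 1
# Assumed order: adverts, >5 spelling mistakes, <200 words, CTA in ALL CAPS
WEIGHTS = [0.2, 0.7, 0.4, 0.5]

# name -> (inputs as 1/0, True if manually classified as spam)
SITES = {
    "site A": ([1, 1, 1, 1], True),
    "site B": ([0, 0, 1, 1], False),
    "site C": ([1, 0, 0, 0], True),
    "site D": ([0, 1, 0, 1], True),
    "site E": ([0, 1, 0, 0], False),
}

def is_spam(inputs, weights=WEIGHTS):
    """Weighted total of the inputs, compared against the threshold."""
    return sum(x * w for x, w in zip(inputs, weights)) >= THRESHOLD

correct = 0
for name, (inputs, actual) in SITES.items():
    predicted = is_spam(inputs)
    correct += predicted == actual
    verdict = "spam" if predicted else "good"
    print(name, verdict, "(correct)" if predicted == actual else "(wrong)")
print(f"{correct}/{len(SITES)} sites correct")
```

The one miss is site C: adverts are its only signal, and a weight of 0.2 on its own never reaches the threshold of 1.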
![Page 38: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/38.jpg)
ANNs typically have many neurons
source: http://www.teco.edu/~albrecht/neuro/html/node18.html
![Page 39: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/39.jpg)
Deep Learning
![Page 40: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/40.jpg)
Humans are good at pattern matching
![Page 41: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/41.jpg)
We’re better than machines…
source: Pawan Sinha (http://web.mit.edu/bcs/sinha/papers/sinha_recog_review_NN.pdf)
![Page 42: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/42.jpg)
ML can learn to recognise cats from examples
![Page 43: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/43.jpg)
Deep Learning learns more like us
![Page 44: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/44.jpg)
Ok, so what does this have to do with Google?
![Page 45: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/45.jpg)
Panda
ML-based algorithm updates
![Page 46: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/46.jpg)
Old index Caffeine
Caffeine - Infrastructure Update (we believe this made Panda+Penguin possible)
![Page 47: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/47.jpg)
Hummingbird is to ??? as
Caffeine is to Panda+Penguin
![Page 48: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/48.jpg)
Hummingbird
Is it similar to Caffeine?
Is it the basis for new natural language algorithms?
![Page 49: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/49.jpg)
Where is Google going next with ML?
![Page 50: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/50.jpg)
Idea
Image Search 2.0
![Page 51: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/51.jpg)
Image Labelling
![Page 52: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/52.jpg)
Image Labelling
![Page 53: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/53.jpg)
Video Labelling
![Page 54: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/54.jpg)
ML Generated Image Descriptions
“Two pizzas sitting on top of a stove top oven”
![Page 55: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/55.jpg)
Idea
Natural Language Faceted Search
![Page 56: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/56.jpg)
‘show me Olympic athletes’ ‘show me the women’
![Page 57: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/57.jpg)
How about:
“Find well-rated vegetarian cooking books written after 1990”
![Page 58: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/58.jpg)
Idea
Factual Accuracy as a Ranking Factor
![Page 59: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/59.jpg)
Fact Checking
Knowledge Vault
![Page 60: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/60.jpg)
Idea: Bad Facts
NBED- shot of Google talking about this shit
Estimating ‘Trustworthiness’
![Page 61: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/61.jpg)
Idea
Entirely ML Generated Algorithm?
![Page 62: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/62.jpg)
http://dis.tl/ml-algo
![Page 63: How to Spot a Bear - An Intro to Machine Learning for SEO](https://reader034.vdocuments.us/reader034/viewer/2022052705/58f9b3ad760da3da068bd887/html5/thumbnails/63.jpg)
Thanks! :)
@TomAnthonySEO