graph theory #searchlove the theory that underpins how all search engines work @kelvinnewman
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
ETC.
graph theory !
The theory that underpins how all search engines work
@kelvinnewman
Graph Theory
The most important theory in search that nobody talks about
Kelvin Newman
@kelvinnewman
JD Hancock
Organiser: BrightonSEO / Content Marketing Show / MeasureFest
Three Free (and awesome) Conferences
Strategy Director: SiteVisibility
A digital agency specialising in retail, travel and financial services
Co-Founder: Clockwork Talent
Decent Digital Recruitment
I might get in trouble for this
JD Hancock
I’ve been let into a secret future beta of Google, and
I’m going to reveal it to you
Joking aside, I think FB GraphSearch is a great indicator of
the future of Google
JD Hancock
as it helps us better
understand one of the
theories that underlies all
search engines
JD Hancock
graph theory !
graph theory !
Hugely Important
graph theory !
Rarely Spoken About
LINKGRAPH
LINKGRAPH
SOCIALGRAPH
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
OPEN GRAPH
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
SORT OF
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
ETC.
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
ETC.
graph theory !
Hugely Important
but we’ve been distracted, from what our job really is
which is understanding how the search engines fundamentally work
jronaldlee
LINKGRAPH
SOCIALGRAPH
KNOWLEDGEGRAPH
ETC.
graph theory !
is central to that understanding
this presentation may contain maths
Benson Kua
Dana Lookadoo - Yo! Yo! SEO
Will Critchlow: Maths MA, University of Cambridge
Tom Anthony: PhD Artificial Intelligence, University of Hertfordshire
Kelvin Newman: Media Studies, University of Sussex
I’m no computer scientist or mathematician
jlwo
“a mathematical model for any system involving a binary relation”
graph theory !
Frank Harary, 1969
“perhaps even more than to the contact
between mankind and nature, graph theory
owes to the contact of human beings between
each other”
Dénes König, 1936
http://www.slideshare.net/digitalmethods/gephi-rieder-23834788
Vertices or Nodes
dominicotine
Nodes are Nouns
Edges are Verbs
basic graph visualisation
This graph is isomorphic to the other one, i.e. it has the same structure but is drawn differently
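"Same but looks different" can be checked mechanically: two graphs are isomorphic if some relabelling of the vertices turns one edge set into the other. A minimal brute-force sketch, only practical for tiny graphs. The second drawing is not in this transcript, so `GRAPH_B` and `GRAPH_C` are hypothetical examples.

```python
from itertools import permutations


def normalise(edges):
    """Store each undirected edge as a frozenset so (1, 2) equals (2, 1)."""
    return {frozenset(edge) for edge in edges}


def is_isomorphic(edges_a, edges_b, n):
    """Try every relabelling of vertices 1..n (fine for n = 6, hopeless for big n)."""
    a, b = normalise(edges_a), normalise(edges_b)
    if len(a) != len(b):
        return False
    vertices = range(1, n + 1)
    for perm in permutations(vertices):
        mapping = dict(zip(vertices, perm))
        relabelled = {frozenset(mapping[v] for v in edge) for edge in a}
        if relabelled == b:
            return True
    return False


# Six-vertex example graph (assumed edges for illustration).
GRAPH_A = [(1, 2), (1, 3), (1, 4), (4, 5), (5, 6)]
# Hypothetical relabelling of GRAPH_A: same shape, different names.
GRAPH_B = [(6, 5), (6, 4), (6, 3), (3, 2), (2, 1)]
# Hypothetical straight chain: same number of edges, different shape.
GRAPH_C = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

SAME_SHAPE = is_isomorphic(GRAPH_A, GRAPH_B, 6)
DIFFERENT_SHAPE = is_isomorphic(GRAPH_A, GRAPH_C, 6)
```

Counting edges is not enough on its own: `GRAPH_C` has five edges too, but no vertex with three neighbours.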
Matrix view of graph
     V1  V2  V3  V4  V5  V6
V1    0   1   1   1   0   0
V2    1   0   0   0   0   0
V3    1   0   0   0   0   0
V4    1   0   0   0   1   0
V5    0   0   0   1   0   1
V6    0   0   0   0   1   0
Or maybe?
            Blogger 1  Blogger 2  Blogger 3  Blogger 4  Blogger 5  Blogger 6
Blogger 1       0          1          1          1          0          0
Blogger 2       1          0          0          0          0          0
Blogger 3       1          0          0          0          0          0
Blogger 4       1          0          0          0          1          0
Blogger 5       0          0          0          1          0          1
Blogger 6       0          0          0          0          1          0
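The matrix view can be generated from a plain list of edges. A minimal sketch, assuming the graph is undirected and the vertices are numbered 1 to 6 as on the slide. The edge list is read off the matrix above; the function name is illustrative.

```python
# Undirected edges read off the matrix on the slide (V1..V6).
EDGES = [(1, 2), (1, 3), (1, 4), (4, 5), (5, 6)]


def adjacency_matrix(edges, n):
    """Return an n x n matrix with 1 where two vertices share an edge."""
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        # Vertices are numbered from 1, list indices from 0.
        matrix[a - 1][b - 1] = 1
        matrix[b - 1][a - 1] = 1  # undirected, so mirror the entry
    return matrix


MATRIX = adjacency_matrix(EDGES, 6)

if __name__ == "__main__":
    for row in MATRIX:
        print(" ".join(str(cell) for cell in row))
```

Swapping the labels V1..V6 for Blogger 1..6 changes nothing in the matrix, which is the point of the "Or maybe?" slide.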
Cardinality (also called order) is the number of nodes, or vertices, in a graph
Prayitno/
The degree of a vertex is how many edges the vertex has.
chedder
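Both numbers fall straight out of the matrix view. A minimal sketch using the six-vertex matrix from the earlier slide; the variable names are illustrative.

```python
# Adjacency matrix from the "Matrix view of graph" slide (rows V1..V6).
MATRIX = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]

# Cardinality: one row per vertex.
cardinality = len(MATRIX)

# Degree of each vertex: the number of 1s in its row.
degrees = [sum(row) for row in MATRIX]

# Every edge is counted once at each end, so edges = total degree / 2.
edge_count = sum(degrees) // 2

if __name__ == "__main__":
    print(cardinality)  # 6
    print(degrees)      # [3, 1, 1, 2, 2, 1]
    print(edge_count)   # 5
```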
Trees & Circuits
Our graph here is known as a tree,
because you can’t loop back on yourself.
If you could loop back on yourself, the loop would be
known as a circuit
This is interesting to think about in the context of your site, or an area of the link graph
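Whether a set of links contains a circuit can be tested with a small union-find. A minimal sketch: the first edge list is the six-vertex graph from the matrix slide, and the extra edge (6, 1) is a hypothetical link added to close a loop. A connected graph with no circuit is a tree.

```python
def has_circuit(edges, n):
    """Return True if the undirected edges on vertices 1..n contain a loop."""
    parent = list(range(n + 1))  # index 0 is unused

    def find(v):
        # Follow parent pointers to the representative of v's group.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # a and b were already connected, so this edge closes a loop.
            return True
        parent[root_a] = root_b
    return False


TREE_EDGES = [(1, 2), (1, 3), (1, 4), (4, 5), (5, 6)]
LOOPED_EDGES = TREE_EDGES + [(6, 1)]  # hypothetical extra link

TREE_HAS_CIRCUIT = has_circuit(TREE_EDGES, 6)
LOOPED_HAS_CIRCUIT = has_circuit(LOOPED_EDGES, 6)
```

Run against a crawl of your own site, the same check would show whether a section is a strict hierarchy or has cross-links that loop back.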
Watch PatrickJMT’s Graph Theory Videos
http://patrickjmt.com/graph-theory-an-introduction/
What’s the first thing you teach your team?
http://i.imgur.com/PGE2D2n.gif
http://computationalculture.net/article/what_is_in_pagerank
For me it is PageRank
http://computationalculture.net/article/what_is_in_pagerank
Jim Seward is a legend
What is PageRank?
http://i.imgur.com/aNXqGNT.gif
A set of rules which can be used to give a numerical weighting to
assess the importance of a document within a linked data set
A set of rules which can be used to give a numerical weighting to
assess the importance of a document (a node) within a linked data set
it is not
PageRank is used for more than the Algo
natalielucier
Understand Lung Cancer
jasleen_kaur
http://www.news-medical.net/news/20130326/Algorithm-similar-to-Google-PageRank-helps-map-spread-of-lung-cancer.aspx
Rank Scientific Significance
http://bulib4research.blogspot.co.uk/2008/11/eigenfactor-scimago-journal-rankings.html
Predict Traffic
deepsan
http://iopscience.iop.org/1742-5468/2008/07/P07008/
three different surfers
Chris Hunkeler
Random Surfer
Reflects the chance that the random surfer will leave the page through a link chosen at random,
so all links are equally likely to be followed, and therefore equally valuable
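The random surfer can be simulated with a few lines of power iteration. A minimal sketch, not Google's implementation: the three-page site is hypothetical, and the damping factor of 0.85 is the commonly quoted value for the chance of following a link rather than jumping to a random page.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Random-surfer PageRank; `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Base share: the surfer gets bored and jumps to any page at random.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Every outgoing link is equally likely to be followed.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links hands its rank to every page equally.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page site.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}

RANKS = pagerank(LINKS)
```

The scores always sum to 1, so each one reads as the share of time the surfer spends on that page.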
three different surfers
Chris Hunkeler
Reasonable Surfer
The reasonable surfer model supposes that some links are more likely to be clicked on and therefore should be
given more value.
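In code, the only change from the random surfer is that each link carries a weight, and rank is split in proportion to those weights rather than equally. A minimal sketch: the pages and the weights (for example a prominent in-content link at 9.0 against a footer link at 1.0) are hypothetical, and how a search engine would actually set them is not covered here.

```python
def weighted_pagerank(weighted_links, damping=0.85, iterations=50):
    """PageRank where `weighted_links[page][target]` is a click-likelihood weight."""
    pages = list(weighted_links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in weighted_links.items():
            total = sum(outlinks.values())
            if total == 0:
                # No usable links: spread this page's rank over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
                continue
            for target, weight in outlinks.items():
                # Heavier links receive a proportionally larger share.
                new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Hypothetical weights: the home page's blog link is far more prominent.
WEIGHTED_LINKS = {
    "home": {"about": 1.0, "blog": 9.0},
    "about": {"home": 1.0},
    "blog": {"home": 1.0, "about": 1.0},
}

WEIGHTED_RANKS = weighted_pagerank(WEIGHTED_LINKS)
```

With equal weights the blog would trail the about page; making its link prominent pushes it ahead.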
three different surfers
Chris Hunkeler
Intentional Surfer
The intentional surfer model supposes that links which ‘actually’ receive the
most clicks should be given more value.
http://en.wikipedia.org/wiki/PageRank#The_intentional_surfer_model
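One way to picture the intentional surfer is to replace guessed link weights with observed behaviour. A minimal sketch, assuming a log of (from page, to page) clicks is available; the log below is invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (page the visitor was on, page they clicked to).
CLICK_LOG = [
    ("home", "blog"),
    ("home", "blog"),
    ("home", "about"),
    ("blog", "home"),
]


def transition_probabilities(click_log):
    """Turn raw click counts into the chance of following each link."""
    counts = defaultdict(Counter)
    for source, target in click_log:
        counts[source][target] += 1
    return {
        source: {
            target: clicks / sum(targets.values())
            for target, clicks in targets.items()
        }
        for source, targets in counts.items()
    }


PROBABILITIES = transition_probabilities(CLICK_LOG)
```

These probabilities could stand in for the equal split used by the random surfer in a PageRank-style calculation.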
A lot has changed at Google, but it will always be a search
engine which relies upon PageRank, which is a practical
application of Graph Theory
Insert Audience Participation
Hands up who thinks FB GraphSearch is the best
search engine in the world?
Just me?
Not here to convince you GraphSearch will catch on
but...
If the area of this slide represents all the traffic on the
internet
This much is Facebook
http://mashable.com/2010/11/19/facebook-traffic-stats/
And everything in white is the rest of the
internet
Google, YouTube, Wikipedia, The Daily
Mail, etc.
your website, my website, her website etc.
If anyone can build a Google-Killer
it’s Facebook...
There’s a fundamental difference between Facebook & Google
Google is about...
documents and links
JD Hancock
Facebook is about...
JD Hancock
things and relationships
this difference is subtle but
huge
but I think it works better for the web as we
know it
JD Hancock
Google are trying to catch up but will struggle
zoom images
JD Hancock
Facebook’s data has a far more explicit
structure than traditional web text
it’s not that tricky
for Google to parse
“I Like Nerf Guns”
porkist
they could even have a go at “I was at Bodeans on Poland Street for Lunch Yesterday”*
*if you mark it up in the right way
R_Savvy
but Google has a much harder job understanding “Kelvin is married to Carolyn”
Facebook knows that happened in 2007
And who attended the ceremony
And when we got engaged
etc.
Google have to infer structure
Facebook know the structure
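A site owner can hand a search engine explicit structure, rather than leaving it to be inferred, by publishing structured markup. A minimal sketch that builds a schema.org description of the marriage example as JSON-LD. The `Person` type and its `name` and `spouse` properties are schema.org vocabulary; the choice of JSON-LD as the format is an assumption for illustration and is not something the slides specify.

```python
import json

# Describe "Kelvin is married to Carolyn" as explicit nodes and an edge.
person = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "Kelvin",
    # The relationship is stated outright instead of buried in a sentence.
    "spouse": {"@type": "Person", "name": "Carolyn"},
}

MARKUP = json.dumps(person, indent=2)

if __name__ == "__main__":
    print(MARKUP)
```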
On GraphSearch you’re not really making a search.
You’re just filtering a structured database of all the
data Facebook has.
But it’s a bloody big database
JD Hancock
1 Billion Users Every Month
240 Million Photos Per Day
2.7 Billion Likes Every Day
People share billions of pieces of content every day
One trillion connections of a thousand different types
1,000,000,000,000
http://mashable.com/2013/07/08/facebook-launch-graph-search/
Every User, Page, Photo, Post & Place is a Node
https://thetribe.s3.amazonaws.com/ferris.gif
Every friendship, checkin, tag or like is an Edge
http://maxlutz.com/blog/wp-content/uploads/2013/05/coffee2.gif
Each Node has Meta-Data, like a description; this is how
the old FB Search “worked”
GraphSearch allows you to search the Edges as well
as the Nodes
JD Hancock
GraphSearch makes it easy to find nodes that are connected to another node by searching for an edge-type combined with an input node.
E.g.:
■ Your friends: friend:10003
■ People who live in New York: lives-in:111
■ People who like Downton Abbey: like:222
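Queries like these can be pictured as lookups in an index keyed by "edge-type:node". A minimal sketch, not Facebook's implementation: the edges and user ids are invented, with 10003, 111 and 222 reused from the examples above.

```python
from collections import defaultdict

# Hypothetical edges: (source node, edge type, target node).
GRAPH_EDGES = [
    (10001, "friend", 10003),
    (10002, "friend", 10003),
    (10001, "lives-in", 111),
    (10004, "lives-in", 111),
    (10001, "like", 222),
    (10002, "like", 222),
]


def build_index(edges):
    """Map each 'edge-type:target' term to the set of nodes with that edge."""
    index = defaultdict(set)
    for source, edge_type, target in edges:
        index[f"{edge_type}:{target}"].add(source)
    return index


def search(index, *terms):
    """Return the nodes that match every term (a simple AND query)."""
    results = [index.get(term, set()) for term in terms]
    return set.intersection(*results) if results else set()


INDEX = build_index(GRAPH_EDGES)
FRIENDS_IN_NEW_YORK = search(INDEX, "friend:10003", "lives-in:111")
FANS = search(INDEX, "like:222")
```

Combining terms is just set intersection, which is why a query such as "my friends who live in New York" amounts to filtering.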
‘Facebook use query-independent signals to come up with a numeric value for importance.
This value is called the “static rank” of the entity.’
JD Hancock
What makes up static rank is still up for debate, but sensibly could be informed by
the elements of Edgerank
aka
the (old name for) newsfeed algo
Affinity
Weight
Decay
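EdgeRank is commonly described as multiplying these three factors for each edge and adding up the results across a story's edges. A minimal sketch under that assumption; the field names and numbers are hypothetical, and nothing here is claimed to be Facebook's actual formula for the news feed or for static rank.

```python
def edgerank(edges):
    """Sum affinity x weight x decay over every edge attached to a story."""
    return sum(e["affinity"] * e["weight"] * e["decay"] for e in edges)


# Hypothetical story with two interactions.
STORY_EDGES = [
    # A comment from a close friend, made a while ago.
    {"affinity": 0.8, "weight": 2.0, "decay": 0.5},
    # A fresh like from a distant acquaintance.
    {"affinity": 0.2, "weight": 1.0, "decay": 1.0},
]

SCORE = edgerank(STORY_EDGES)
```

Here affinity stands for how close the viewer is to the person acting, weight for the type of interaction, and decay for how old it is.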
The value of legitimate likes from well connected
people just increased
There’s also been a lot going on at Google
not a new update
Martin Cathrae
but a new paradigm
introducing Knowledge Graph*
*and things not technically Knowledge Graph but sort of along the same lines
The Knowledge Graph enables you to search for things, people or places that Google knows about—landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more—and instantly get information that’s relevant to your query.
Amit Singhal, Google
Knowledge Graph is part of a huge change in how Google deliver
search results
I’m now going to give you lots of examples of changes in the way
Google present results, not all of them are truly ‘Knowledge
Graph’ but do indicate a general shift in the
way they present results.
There are more than 85 of these features that Dr. Pete from Moz has documented
http://www.slideshare.net/crumplezone/beyond-10-blue-links-the-future-of-ranking
But they’re just for informational queries... Right?
a change in purpose:
from “help find pages” to “help find answers”
You can no longer rely on Google to send you traffic
or even tell you about it
Alex E. Proimos
JD Hancock
for nearly a year iPhone search
traffic appeared as direct
and we’re rapidly approaching the point where we have no data on keyword traffic
Search isn’t about keywords anymore
It's about entities.
chukgawlikphotography
Entities are normally
people, places, brands etc.
JD Hancock
but can be any ‘thing’ which has a relationship to another ‘thing’
JD Hancock
how can you make money if nobody ever goes to your site?
JD Hancock
You may need to revisit your business model
kennymatic
I love the Business Model Canvas
http://en.wikipedia.org/wiki/Business_Model_Canvas
sit down and ask yourself: could your business have
an API?
as every business is really just a
database and a front end
JD Hancock
and Google wants to become that front-end
JD Hancock
So what can I do?
Familiarize yourself with Freebase
http://www.freebase.com/
And DBpedia
http://wiki.dbpedia.org/Datasets
It’s amazing the data they have
yaph
If any of your keywords contain entities you MUST be prepared
http://i.imgur.com/GLCC0bd.gif
Use BlueNod to Visualise Social Networks
http://bluenod.com/
Different communities manifest themselves in different ways
http://www.beautifullife.info/wp-content/uploads/2012/12/11/05.gif
Play with VisualDataWeb
http://www.visualdataweb.org/relfinder
No schema? Create one/extend one
http://schema.org/docs/extension.html
Follow Peter Mika @pmika
Read Matthew J. Brown’s Mozcon Deck
http://www.slideshare.net/MatthewBrownPDX/strings-to-things-the-move-to-semantic-seo-mozcon-2013
Watch WSDM Videos
Web Search and Data Mining Conference
http://videolectures.net/wsdm/
Do Good Marketing
tl;dr
SEO is changing: it’s not about optimising your website for search engines, it’s about optimising your business
for search engines