interesting data from the 2011 ranking factors

Interesting Findings from the 2011 Search Ranking Factors. Presented for SMX Advanced, Seattle. Rand Fishkin, SEOmoz CEO, June 2011. The full data is now online at http://bit.ly/rankfactors2011. This deck is available online at http://bit.ly/randsmxdeck

Upload: rand-fishkin

Post on 15-Jan-2015


DESCRIPTION

Rand Fishkin's slide deck from SMX Advanced Seattle 2011 showing several interesting datapoints from the 2011 Search Engine Rankings Factors survey and correlation data on SEOmoz.

TRANSCRIPT

Page 1: Interesting Data from the 2011 Ranking Factors

Interesting Findings from the 2011 Search Ranking Factors

Presented for SMX Advanced, Seattle

Rand Fishkin, SEOmoz CEO, June 2011

The full data is now online at http://bit.ly/rankfactors2011

This deck is available online at http://bit.ly/randsmxdeck

Page 2: Interesting Data from the 2011 Ranking Factors

Understanding, Interpreting & Using Survey Opinion Data

Everybody’s wrong sometimes, but there’s a lot we can learn from the aggregation of opinions

Page 3: Interesting Data from the 2011 Ranking Factors

http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html

Many thanks to all who contributed their time to take the survey!

#1: Opinions are Not Fact (these are smart people, but they can’t know everything about Google’s rankings)

#2: Not Everyone Agrees (standard deviation can help show us the degree of consensus)

#3: We Had 132 Contributors (but this group could be biased as they were editorially selected via a nomination process)
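As a rough illustration of point #2, the sketch below summarizes a set of survey votes with a mean and a standard deviation. The vote values, the 1-to-10 scale, and the factor names are hypothetical and are not taken from the survey; the point is only that a smaller standard deviation indicates stronger consensus.

```python
from statistics import mean, pstdev

def summarize_votes(votes):
    """Return (mean, standard deviation) for one factor's votes."""
    return mean(votes), pstdev(votes)

# Hypothetical votes on a 1-10 importance scale (not real survey data).
example_votes = {
    "factor with broad agreement": [7, 7, 8, 7, 8],
    "factor with contention": [2, 10, 3, 10, 9],
}

for factor, votes in example_votes.items():
    avg, spread = summarize_votes(votes)
    print(f"{factor}: mean={avg:.2f}, std dev={spread:.2f}")
```

Two factors can have similar averages while one of them hides a sharp split among voters, which is why the spread is worth reading alongside the mean.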

Page 4: Interesting Data from the 2011 Ranking Factors

Understanding, Interpreting & Using Correlation Data

This is powerful, useful information, but with that power comes responsibility to present it accurately

Page 5: Interesting Data from the 2011 Ranking Factors

More details, including complete documentation and the raw dataset, are now available at http://www.seomoz.org/article/search-ranking-factors

Methodology

10,271 Keywords, pulled from Google AdWords US Suggestions (all SERPs were pulled from Google in March 2011, after the Panda/Farmer update)

Top 30 Results Retrieved for Each Keyword (excluding all vertical/non-standard results)

Correlations are for Pages/Sites that Appear Higher in the Top 30 (we use the mean of Spearman’s correlation coefficient across all SERPs)

Results Where <2 URLs Contain a Given Feature Are Excluded (this also holds true for results where all the URLs contain the same values for a feature)
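The methodology above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the report: the function names, the use of None for a missing feature, and the sign convention (a positive value means the feature is higher on better-ranked results) are all assumptions made for the example.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

def mean_spearman(serps, min_present=2):
    """Mean per-SERP Spearman correlation between a feature and ranking.

    serps: one list per keyword, holding the feature value of each result
    in ranking order (index 0 is position 1); None means the feature is absent.
    """
    per_serp = []
    for values in serps:
        present = [(pos, v) for pos, v in enumerate(values, start=1) if v is not None]
        if len(present) < min_present:
            continue  # fewer than 2 URLs contain the feature
        feature = [v for _, v in present]
        if len(set(feature)) == 1:
            continue  # all URLs have the same value, so correlation is undefined
        # Negate position so a positive coefficient means "higher value, higher ranking".
        goodness = [-pos for pos, _ in present]
        per_serp.append(spearman(feature, goodness))
    return mean(per_serp) if per_serp else None
```

Calling mean_spearman on a list of 30-item result lists, one per keyword, returns a single averaged coefficient for that feature, or None if every SERP was excluded.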

Page 6: Interesting Data from the 2011 Ranking Factors


Dolphins who swim at the front of the pod tend to have larger dorsal fins, more muscular tails and more damage on their flippers. The first two might have a causal link, but the damaged flippers are likely a result of swimming at the front (i.e. having damaged flippers doesn’t make a dolphin a better front-of-the-pod swimmer). Likewise, with ranking correlations, there are probably many features that are correlated with positive/negative rankings but are not necessarily the cause of them.

Correlation & Dolphins

Page 7: Interesting Data from the 2011 Ranking Factors

Just because a feature is correlated, even very highly, doesn’t necessarily mean that improving that metric on your site will improve your rankings.

Correlation IS NOT Causation

Earning more linking root domains to a URL may indeed increase that page’s ranking.

But will adding more characters to the HTML code of a page increase rankings? Probably not.

Page 8: Interesting Data from the 2011 Ranking Factors

Standard error won’t be reported in this presentation, but it’s less than 0.0035 for all of the Spearman correlation results (so we can feel quite confident about our numbers).

How Confident Can We Be in the Accuracy of these Correlations?

Because we have such a large data set, standard error is extremely low. This means even for small correlations, our estimates of the mean correlation are close to the actual mean correlation across all searches.
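To make the role of sample size concrete, here is a minimal sketch of the standard error of a mean, computed as the sample standard deviation divided by the square root of the number of observations. The per-SERP spread of 0.3 used in the example is a hypothetical value chosen for illustration, not a figure from the dataset; only the keyword count comes from the methodology.

```python
from math import sqrt
from statistics import stdev

def standard_error_of_mean(values):
    """Standard error of the mean for a list of per-SERP correlations."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return stdev(values) / sqrt(len(values))

def se_from_spread(std_dev, n):
    """Standard error given a standard deviation and a sample size."""
    return std_dev / sqrt(n)

# Hypothetical spread of 0.3 across the 10,271 keywords in the methodology.
print(round(se_from_spread(0.3, 10271), 4))
```

Because the sample size sits under a square root, quadrupling the number of keywords only halves the standard error, but with over ten thousand SERPs even a wide per-SERP spread leaves a tight estimate of the mean.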

Page 9: Interesting Data from the 2011 Ranking Factors

A rough rule of thumb with linear fit numbers is that the square of the correlation is the share of the system’s variance it explains. Thus, a factor with correlation 0.3 would explain ~9% of the variance in rankings (loosely, ~9% of Google’s algorithm).

Do Correlations in this Range Have Value/Meaning?

A factor w/ 1.0 correlation would explain 100% of Google’s algorithm across 10K+ keywords

Most of our data is in this range
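The rule of thumb above is simple arithmetic, sketched here for a few sample values. The function name is an assumption for this example, and the rule itself is only a rough guide when applied to rank correlations.

```python
def variance_explained(r):
    """Share of variance explained by a correlation r (the square of r)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r

for r in (0.1, 0.2, 0.3, 1.0):
    print(f"correlation {r:.1f} -> about {variance_explained(r):.0%} of variance")
```

Squaring shrinks small numbers quickly, so a correlation that looks modest on a chart accounts for an even smaller share of the variance.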

Page 10: Interesting Data from the 2011 Ranking Factors

Are You Ready for Some Data?!

Page 11: Interesting Data from the 2011 Ranking Factors

The Changing Landscape of Google’s Ranking Algorithm

These compare opinion/survey data from 2009 vs. 2011

Page 12: Interesting Data from the 2011 Ranking Factors

In 2009, link-based factors (page and domain-level) comprised ~65% of voters’ algorithmic assessment.

Page 13: Interesting Data from the 2011 Ranking Factors

In 2011, link-based factors (page and domain-level) have shrunk in the voters’ minds from ~65% to ~45% of algorithmic components. Note: because the question options changed slightly (and more options were added), direct comparison may not be entirely fair.

Page 14: Interesting Data from the 2011 Ranking Factors

While there was some significant contention about issues like paid links and ads vs. content, the voters nearly all agreed that social signals and perceived user value signals have bright futures.

What Do SEOs Believe Will Happen w/ Google’s Use of Ranking Features in the Future?

Page 15: Interesting Data from the 2011 Ranking Factors

Diversity + Anchor Text:

Well Correlated with Higher Rankings

These metrics are based on links that point specifically to the ranking page

Page 16: Interesting Data from the 2011 Ranking Factors


This data is exactly what an SEO would expect – the more diverse the sources, the greater the correlation with higher rankings. These numbers are relatively similar to our June 2010 correlation data (from http://www.seomoz.org/blog/google-vs-bing-correlation-analysis-of-ranking-elements).

In the rest of this deck, we’ll use linking c-blocks as a reference point, hence the red

Page 17: Interesting Data from the 2011 Ranking Factors

Partial anchor text matches have greater correlation than exact match. This might be correlation only, or could indicate that the common SEO wisdom to vary anchor text is accurate.

Correlations of Page-Level, Anchor Text-Based Link Data

No Surprise: Total links (including internal) w/ anchor text is less well-correlated than external links w/ anchor text

Page 18: Interesting Data from the 2011 Ranking Factors

Comparing Page + Domain-Level Link Signals

These metrics are based on links that point to anywhere on the ranking domain

Page 19: Interesting Data from the 2011 Ranking Factors

Domain-level link data is surprisingly similar to page-level link data in correlation.

Correlation of Domain-Level Link Data

Suggests page-level + domain-level link signals have relatively similar weighting, just as voters predicted.

Page 20: Interesting Data from the 2011 Ranking Factors

Have Exact Match Domains Lost their Lustre?

These signals are based on keyword-use in the root domain name.

Page 21: Interesting Data from the 2011 Ranking Factors

This suggests that Google’s statements last year about devaluing exact match domains were not only serious, but are already showing up in the results.

Spearman’s Correlation with Google Rankings for Exact Match Domain Names, June 2010 vs. March 2011

Exact match domains (.com and all TLDs) have both fallen considerably in the past 10 months

Page 22: Interesting Data from the 2011 Ranking Factors

Is Google Evil?

These metrics come from a variety of places in the dataset, but mostly on-page stuff.

Page 23: Interesting Data from the 2011 Ranking Factors

This data suggests that, by and large, there’s not much “evil” in Google’s rankings, or at least none that correlation research will reveal. Good job keeping it honest, Googlers!

Google has said that linking externally is good; slow pages are bad; and using Google services won’t give any special benefit. This data supports those statements!

Page 24: Interesting Data from the 2011 Ranking Factors

Social Signals

These signals are based on data from users of Twitter, Facebook & Google Buzz via their APIs

Page 25: Interesting Data from the 2011 Ranking Factors

Although we didn’t ask voters for a cutoff on what they believe matters vs. doesn’t, I suspect many/most would have said that Google Buzz and Digg/Reddit/SU aren’t used in the rankings.

Most Important Social Media-Based Factors (as voted on by 132 SEOs)

Curious: For Twitter, voters felt authority matters more, while for Facebook, it’s raw quantity (could be because Google doesn’t have as much access to FB graph data).

Page 26: Interesting Data from the 2011 Ranking Factors

Although voters thought Twitter data / tweets to URLs were more influential, Facebook’s metrics are substantially better correlated with rankings. Time to get more FB Shares!

Correlation of Social Media-Based Factors (data via Topsy API & Google Buzz API)

Amazing: Facebook Shares is our single highest correlated metric with higher Google rankings.

Page 27: Interesting Data from the 2011 Ranking Factors

For most link factors, 99%+ of results had data from Linkscape; for social data, this was much lower, but still high enough that standard error is below 0.0025 for each of the metrics.

Percent of Results (from our 10,200 Keyword Set) in Which the Feature Was Present

It amazed me that Facebook Share data was present for 61% of pages in the top 30 results

Page 28: Interesting Data from the 2011 Ranking Factors

Twitter’s correlation wanes dramatically, but Facebook features, while lower, still appear quite influential. Facebook likely deserves much more SEO attention than it currently receives.

Correlation of Social Metrics, Controlling for Links (i.e. are pages ranking well because of links, with social metrics simply being good predictors of linking activity?)

Raw Correlations vs. Correlations Controlling for Links
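One common way to "control for" a third variable is a first-order partial correlation, sketched below. The deck does not say which technique was used, so treat this as an illustrative assumption rather than the report's method; the three input correlations in the example are hypothetical numbers, not values from the dataset.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z.

    r_xy: correlation of x (e.g. a social metric) with y (ranking)
    r_xz: correlation of x with z (e.g. a link metric)
    r_yz: correlation of y with z
    """
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical inputs: social vs. ranking 0.30, social vs. links 0.50,
# links vs. ranking 0.25.
print(round(partial_correlation(0.30, 0.50, 0.25), 3))
```

If a social metric were only a proxy for links, its partial correlation would fall toward zero; a value that stays well above zero suggests the metric carries some information that links alone do not.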

Page 29: Interesting Data from the 2011 Ranking Factors

Be a responsible user of correlation data. Thank you!

IMPORTANT! Don’t Misuse or Misattribute Correlation Data!

Think of correlation data as a way of seeing features of sites that rank well, rather than a way of seeing what metrics search engines are actually measuring and counting.

A well-correlated metric can often be its own reward, even if it doesn’t directly impact search engine rankings. Virtually all the data in this report reflect the best practices of inbound marketing overall, and using the data to help support these is an excellent application.

Thanks much! Rand

Page 30: Interesting Data from the 2011 Ranking Factors

Q+A

Rand Fishkin

CEO & Co-Founder, SEOmoz

• Twitter: @randfish

• Blog: www.seomoz.org/blog

• Email: [email protected]

• Slides: http://bit.ly/randsmxdeck

The full report is now available! http://bit.ly/rankfactors2011