Server-Side SEO (The Art of Making Love to Spiders) by Boaz Sasson (SimilarWeb)
TRANSCRIPT
Boaz Sasson | Head of Performance | sasson@similarweb.com
Server-side SEO: The Art of Making Love to Spiders
Every Website
Traffic Trends, Desktop & Mobile
Referring Traffic
Keywords
Advertising Analysis
Popular Pages
Every App
Current User Installs
Active Users
Engagement per App
Retention Analysis
App store optimization
Every Country
We Reveal the Secrets of Online Success
Not Even a Spider: Googlebot Is a Headless Browser, Not a Simple Link Crawler
Can render pages visually
Traverses the DOM
Executes AJAX, JS & forms
*Spotted in the wild as early as 2010 (http://searchengineland.com/googles-proposal-for-crawling-ajax-may-be-live-34411)
Meet the Cookie Monster
Googlebot DOES seem able to accept cookies
Cookie acceptance is no longer a reliable way to segregate/sniff bot traffic
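Since cookie acceptance no longer separates bots from humans, the more robust approach is to confirm the claimed identity itself. A minimal sketch, assuming a plain User-Agent check as the first pass and reverse DNS as the confirmation (the function names here are hypothetical, not from the talk):

```python
import re
import socket

# Naive first pass: does the User-Agent merely *claim* to be Googlebot?
# Headless bots send browser-like headers and accept cookies, so string
# matching alone is trivially spoofable.
GOOGLEBOT_UA = re.compile(r"Googlebot", re.IGNORECASE)

def claims_to_be_googlebot(user_agent: str) -> bool:
    """Return True if the User-Agent string mentions Googlebot."""
    return bool(GOOGLEBOT_UA.search(user_agent))

def verify_googlebot(remote_ip: str) -> bool:
    """Reverse-DNS confirmation (sketch, needs network access):
    resolve the IP to a hostname, check it belongs to Google, then
    forward-resolve the hostname back to the same IP."""
    try:
        host = socket.gethostbyaddr(remote_ip)[0]
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    return remote_ip in socket.gethostbyname_ex(host)[2]
```

The two-step forward/reverse lookup matters: anyone can point reverse DNS for their own IP at a `googlebot.com` name, but only Google can make that name resolve back to the same IP.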
Mr. Greedybot Requires Access to All Your Files, and Quickly
Do NOT block JS, CSS, scripts or images from Googlebot if they are needed to render a page
Avoid setting crawl rate limits, if at all possible
Search bot traffic can eat into your bandwidth; let it
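As a concrete illustration of the "don't block rendering assets" advice, a robots.txt along these lines keeps Googlebot able to render pages the way users see them (the paths are hypothetical):

```
# Keep rendering assets crawlable so Googlebot can fully render pages.
User-agent: *
Allow: /css/
Allow: /js/
Allow: /images/
Disallow: /admin/

# Do NOT do this -- blocking assets breaks rendering:
# Disallow: /js/
# Disallow: /css/
```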
Impact on PR/Link Juice?
It's not clear how juice flows through links in JS, forms, code, etc.
Consider all the links/filepaths in code, as well as text links, as part of the page’s link graph
Internal links can promote indexation, pass juice, or do both
Crawl Depth Is Probably Based On:
Amount of incoming links/buzz
Content creation rate/amounts
Trust – more of it results in wider and deeper crawling
Think in terms of both crawling a site (laterally) and crawling a page (depth)
Bad/low quality content getting crawled is a waste of your crawl budget
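The lateral-vs-deep distinction above can be made concrete with click depth: how many link hops a crawler needs from the homepage to reach a page. A toy sketch over a hypothetical internal-link graph (the URLs are illustrative, not from the talk):

```python
from collections import deque

# Hypothetical internal-link graph: each page maps to the pages it
# links to. Click depth from the homepage approximates how deep a
# crawler must go -- and deep, weakly linked pages burn crawl budget.
LINKS = {
    "/": ["/products/", "/blog/"],
    "/products/": ["/products/widget/"],
    "/blog/": ["/blog/post-1/"],
    "/blog/post-1/": ["/products/widget/"],
    "/products/widget/": [],
}

def click_depths(start="/"):
    """Breadth-first search from the homepage; returns {url: depth}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in LINKS.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

Pages that never show up in the result at all are orphans: unreachable by a crawler following internal links, however good their content.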
Indicator of a Site's Health
Crawl stats can be used as a quick indicator of a site's general SEO health
How many pages indexed?
Trends?
What errors/parameters are indexed?
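One cheap way to watch those trends, beyond Search Console, is tallying the status codes your server returns to the crawler. A minimal sketch, assuming Apache-style access-log lines (the sample lines are hypothetical):

```python
import re
from collections import Counter

# Hypothetical access-log lines; in practice read these from your
# server logs and filter to verified Googlebot requests.
LOG_LINES = [
    '66.249.66.1 - - [10/May/2016] "GET / HTTP/1.1" 200 Googlebot',
    '66.249.66.1 - - [10/May/2016] "GET /old-page HTTP/1.1" 404 Googlebot',
    '66.249.66.1 - - [10/May/2016] "GET /blog/ HTTP/1.1" 200 Googlebot',
]

STATUS_RE = re.compile(r'" (\d{3}) ')

def crawl_status_counts(lines):
    """Tally HTTP status codes served to the crawler -- a rising share
    of 404/5xx responses is a quick red flag for SEO health."""
    counts = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Tracking these counts over time, rather than as a one-off, is what turns them into the trend indicator the slide describes.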
Assume that anything in the DOM is technically accessible to modern search bots, even though it may not pass juice
Google Really Dislikes Broken Links
Check using a scheduled 404 report or spider
Scan on a regular basis
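A scheduled scan like that reduces to checking each known URL's status code. A sketch with the fetcher injected so the logic is testable offline (in production it would issue a HEAD request via urllib.request or an HTTP library; the names here are hypothetical):

```python
def find_broken_links(urls, fetch_status):
    """Return the URLs whose HTTP status indicates a broken link.

    `fetch_status` is any callable mapping a URL to a status code;
    injecting it keeps the filtering logic independent of the network.
    """
    return [url for url in urls if fetch_status(url) >= 400]

# Offline usage example with a stubbed status lookup (hypothetical URLs):
STATUSES = {"/": 200, "/blog/": 200, "/old-page": 404, "/tmp/": 500}
broken = find_broken_links(STATUSES, STATUSES.get)
```

Wiring this to a cron job and diffing the result against the previous run gives the "scheduled 404 report" the slide recommends.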
Have a clear hierarchy in terms of directory structure; use internal links to emphasize relationships
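A clear directory hierarchy also shows up directly in URL paths, which gives a second, simpler depth signal alongside click depth. A small sketch (the example URLs are hypothetical):

```python
from urllib.parse import urlparse

def path_depth(url: str) -> int:
    """Number of directory levels in a URL's path. In a clean
    /category/subcategory/page hierarchy, shallower paths signal
    more important pages and tend to get crawled sooner."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg])
```

If many important pages sit at depth four or more, that is usually a hint the hierarchy (or the internal linking that should flatten it) needs work.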