Server-Side SEO (The Art of Making Love to Spiders) by Boaz Sasson (SimilarWeb)
TRANSCRIPT
Boaz Sasson | Head of Performance | sasson@similarweb.com
Server-side SEO: The Art of Making Love to Spiders
Every Website
Traffic Trends, Desktop & Mobile
Referring Traffic
Keywords
Advertising Analysis
Popular Pages
Every App
Current User Installs
Active Users
Engagement per App
Retention Analysis
App store optimization
Every Country
We Reveal the Secrets of Online Success
Not Even a Spider: Googlebot Is a Headless Browser, Not a Simple Link Crawler
Can render pages visually
Traverses the DOM
Executes AJAX, JS & forms
*Spotted in the wild as early as 2010 (http://searchengineland.com/googles-proposal-for-crawling-ajax-may-be-live-34411)
Meet the Cookie Monster
Googlebot DOES seem able to accept cookies
Cookie acceptance is no longer a reliable way to segregate/sniff bot traffic
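Since cookie acceptance no longer separates bots from humans, the more robust approach is to confirm the claimed identity itself. A minimal sketch, assuming a plain User-Agent check as the first pass and reverse DNS as the confirmation (the function names here are hypothetical, not from the talk):

```python
import re
import socket

# Naive first pass: does the User-Agent merely *claim* to be Googlebot?
# Headless bots send browser-like headers and accept cookies, so string
# matching alone is trivially spoofable.
GOOGLEBOT_UA = re.compile(r"Googlebot", re.IGNORECASE)

def claims_to_be_googlebot(user_agent: str) -> bool:
    """Return True if the User-Agent string mentions Googlebot."""
    return bool(GOOGLEBOT_UA.search(user_agent))

def verify_googlebot(remote_ip: str) -> bool:
    """Reverse-DNS confirmation (sketch, needs network access):
    resolve the IP to a hostname, check it belongs to Google, then
    forward-resolve the hostname back to the same IP."""
    try:
        host = socket.gethostbyaddr(remote_ip)[0]
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    return remote_ip in socket.gethostbyname_ex(host)[2]
```

The two-step forward/reverse lookup matters: anyone can point reverse DNS for their own IP at a `googlebot.com` name, but only Google can make that name resolve back to the same IP.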
Mr. Greedybot Requires Access to All Your Files, and Quickly
Do NOT block JS, CSS, scripts or images from Googlebot if they are needed to render a page
Avoid setting crawl rate limits, if at all possible
Search bot traffic can eat into your bandwidth; let it
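As a concrete illustration of the "don't block rendering assets" advice, a robots.txt along these lines keeps Googlebot able to render pages the way users see them (the paths are hypothetical):

```
# Keep rendering assets crawlable so Googlebot can fully render pages.
User-agent: *
Allow: /css/
Allow: /js/
Allow: /images/
Disallow: /admin/

# Do NOT do this -- blocking assets breaks rendering:
# Disallow: /js/
# Disallow: /css/
```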
Impact on PR/Link Juice?
It's not clear how juice flows through links in JS, forms, code, etc.
Consider all the links/filepaths in code, as well as text links, as part of the page’s link graph
Internal links can promote indexation, pass juice, or do both
Crawl Depth Is Probably Based On:
Amount of incoming links/buzz
Content creation rate/amounts
Trust – more of it results in wider and deeper crawling
Think in terms of both crawling a site (laterally) and crawling a page (depth)
Bad/low quality content getting crawled is a waste of your crawl budget
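The lateral-vs-deep distinction above can be made concrete with click depth: how many link hops a crawler needs from the homepage to reach a page. A toy sketch over a hypothetical internal-link graph (the URLs are illustrative, not from the talk):

```python
from collections import deque

# Hypothetical internal-link graph: each page maps to the pages it
# links to. Click depth from the homepage approximates how deep a
# crawler must go -- and deep, weakly linked pages burn crawl budget.
LINKS = {
    "/": ["/products/", "/blog/"],
    "/products/": ["/products/widget/"],
    "/blog/": ["/blog/post-1/"],
    "/blog/post-1/": ["/products/widget/"],
    "/products/widget/": [],
}

def click_depths(start="/"):
    """Breadth-first search from the homepage; returns {url: depth}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in LINKS.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

Pages that never show up in the result at all are orphans: unreachable by a crawler following internal links, however good their content.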
Indicator of a Site's Health
Crawl stats can be used as a quick indicator of a site's general SEO health
How many pages indexed?
Trends?
What errors/parameters are indexed?
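One cheap way to watch those trends, beyond Search Console, is tallying the status codes your server returns to the crawler. A minimal sketch, assuming Apache-style access-log lines (the sample lines are hypothetical):

```python
import re
from collections import Counter

# Hypothetical access-log lines; in practice read these from your
# server logs and filter to verified Googlebot requests.
LOG_LINES = [
    '66.249.66.1 - - [10/May/2016] "GET / HTTP/1.1" 200 Googlebot',
    '66.249.66.1 - - [10/May/2016] "GET /old-page HTTP/1.1" 404 Googlebot',
    '66.249.66.1 - - [10/May/2016] "GET /blog/ HTTP/1.1" 200 Googlebot',
]

STATUS_RE = re.compile(r'" (\d{3}) ')

def crawl_status_counts(lines):
    """Tally HTTP status codes served to the crawler -- a rising share
    of 404/5xx responses is a quick red flag for SEO health."""
    counts = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Tracking these counts over time, rather than as a one-off, is what turns them into the trend indicator the slide describes.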
Assume that anything in the DOM is technically accessible to modern search bots, even though it may not pass juice
Google Really Dislikes Broken Links
Check using a scheduled 404 report or spider
Scan on a regular basis
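A scheduled scan like that reduces to checking each known URL's status code. A sketch with the fetcher injected so the logic is testable offline (in production it would issue a HEAD request via urllib.request or an HTTP library; the names here are hypothetical):

```python
def find_broken_links(urls, fetch_status):
    """Return the URLs whose HTTP status indicates a broken link.

    `fetch_status` is any callable mapping a URL to a status code;
    injecting it keeps the filtering logic independent of the network.
    """
    return [url for url in urls if fetch_status(url) >= 400]

# Offline usage example with a stubbed status lookup (hypothetical URLs):
STATUSES = {"/": 200, "/blog/": 200, "/old-page": 404, "/tmp/": 500}
broken = find_broken_links(STATUSES, STATUSES.get)
```

Wiring this to a cron job and diffing the result against the previous run gives the "scheduled 404 report" the slide recommends.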
Have a clear hierarchy in terms of directory structure; use internal links to emphasize relationships
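A clear directory hierarchy also shows up directly in URL paths, which gives a second, simpler depth signal alongside click depth. A small sketch (the example URLs are hypothetical):

```python
from urllib.parse import urlparse

def path_depth(url: str) -> int:
    """Number of directory levels in a URL's path. In a clean
    /category/subcategory/page hierarchy, shallower paths signal
    more important pages and tend to get crawled sooner."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg])
```

If many important pages sit at depth four or more, that is usually a hint the hierarchy (or the internal linking that should flatten it) needs work.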