Crawling, indexing, ranking: Make the search engine crawlers and algorithms your b****
Bucharest – September 24th, 2015
SEMdays – Andre Alpar – Twitter: @andrealpar 1
Agenda
• Self introduction
• Technical SEO
• Crawling management
• Indexation management
Andre Alpar has 15 years of entrepreneurial work experience in online marketing
• First own dotcom company in 1998
• Founded audiobook publishing company
• Founded VC-backed marketplace
• More than 30 business angel investments
• OMCap, OMReport, OMBook, Speaker, Author, Open-Source-Software
• 3.5 years of experience in a leadership position at Rocket Internet
• Side job: CMO at Noblego.de
• Since 2012: MD with PerformicsAKM3 Berlin
PerformicsAKM3 employs 250 people in Germany – international team doing performance, content and search marketing
• AKM3 founded late 2009
• Part of Publicis since late 2014
• 150 people in Berlin
• Native speakers of 15 languages in Berlin
• PerformicsAKM3 since Sept. 2015
The “big agency” world can be slightly complex and confusing ;-) Here’s where we stand!
• Pure performance marketing agency
• Over 1200 employees worldwide
• Over 30 offices worldwide
• Over 300 clients
• In Germany: PerformicsAKM3 with 250 employees
• Acquired large agencies in CZ, AR in 2015
The group
The network
The agencies
We are not “only” an online marketing agency but also an online shop for cigars
• Strategic approach for product choice
• Own financing – focus on return
• Two brands: Noblego and Cigarmaxx
• Complementary online magazine: zigarren.org
• Preview: Niche shops
Disclaimer: My style of slides is special, thoughtful and individual and if you don’t think it’s funny you are just not getting it!
This is a technical SEO presentation! There will be “dry theory”! Leave now or master it and you will be the real deal!
For top performing SEO every facet of the art has to be mastered! It is “like a multiplication where each factor has to be >1”
Figure: the facets of SEO: On-page, Off-page, Strategic, Technical, Content, Links, User Signals, Keywords, Information Structure, Semantics, KPIs / Reporting, Process oriented, Project oriented
Technical SEO is a part of the groundwork! The better the groundwork – the higher you can build!
Figure: effort / costs over time for Strategy, Technical, Content and Off-page
Crawling and indexation management is one of three areas that exist within technical SEO
Site structure: single URL
Information architecture: relation between different URLs
Crawling and indexation management: steering the whole domain
Technical SEO is done simultaneously on three different levels, which must be optimized using an integrative approach
Domain level, e.g. robots.txt
Template level, e.g. all pages that display individual articles
Individual URL level, e.g. manually generated internal links in the site’s content
Crawling and indexing can be managed on each level
A simple search engine user interface is based on complex systems consisting of various components
Figure: components of a search engine
• Internet: publication of websites
• Scheduler: co-ordinates the webcrawler system, sends commands to the crawler for website crawling
• Crawler: resource download, existence check, timeliness (website requests, procurement of website content)
• Storeserver: checks incoming data and updates it in the index
• Repository: saves HTML documents for further analysis
• IR-System (rules)
• Document Index: storage
• Query Processor (search)
• Search Mask (input)
• User
Data flows between the components: collected data, exclude URLs, sorted data (document information / document metas), send checked data, search request, search queries, search phrases, relevant search results, sorted search results, search result
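The interplay of the components in the diagram can be sketched as a toy pipeline. This is only an illustration under strong simplifications, not how any real search engine is built: the in-memory `FAKE_WEB` dictionary stands in for the Internet, and all class and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for the Internet: URL -> page text.
FAKE_WEB = {
    "https://www.example.com/": "cigars and cigar accessories",
    "https://www.example.com/help": "help and shipping information",
}


def crawl(url):
    """Crawler: 'download' a resource; None means the URL does not exist."""
    return FAKE_WEB.get(url)


class ToySearchEngine:
    def __init__(self):
        self.repository = {}           # raw documents kept for analysis
        self.index = defaultdict(set)  # document index: word -> set of URLs

    def run_scheduler(self, urls):
        """Scheduler: tells the crawler which URLs to fetch, then stores results."""
        for url in urls:
            content = crawl(url)
            if content is None:
                continue  # existence check failed, nothing to store
            self.repository[url] = content
            for word in content.lower().split():
                self.index[word].add(url)

    def search(self, query):
        """Query processor: return URLs that contain every query word."""
        words = query.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)


engine = ToySearchEngine()
engine.run_scheduler(list(FAKE_WEB) + ["https://www.example.com/missing"])
print(engine.search("cigars"))
```

The point of the sketch: a URL that the scheduler never hands to the crawler cannot reach the index, and a URL that is not in the index cannot be returned by the query processor.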
If crawling and indexing are not managed well, good rankings are a lot harder to achieve
Crawling Indexing Ranking
200+ factors and everybody knows how it works – NOT!
Necessary preconditions for great rankings and efficiency + effectiveness in SEO
To steer crawling and indexing well, make sure you understand what search engines are trying to figure out at each stage
Crawling: Which URLs may I analyze at all?
Indexing: Which URLs are “worthy” of being considered as a search engine result?
Ranking: Who is going to be found in which position?
For both crawling and indexing, capacity is limited, i.e. there is a budget for each website
Help the lazy spider (bot / crawler) focus on your most important URLs
Crawling Indexing Ranking
I won’t crawl everything! I am lazy! You help me focus or I’ll pick the URLs I crawl myself!
When laying out your crawling strategy think of onions and their layers
Utilize the theory of sets when defining the crawling strategy for your website
www.example.com/url1.html
www.example.com/url1.html?param=magic123
All URLs that share a specific criterion
Also remember how intersections work in set theory
People who go to search marketing conferences
People in Bucharest
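Applied to URLs, the set idea can be sketched with plain Python sets. The URL inventory and the two criteria (has a query parameter, lives under a `/print/` path) are assumptions made up for illustration; only the first two URLs follow the pattern shown on the slide.

```python
from urllib.parse import urlsplit

# Hypothetical URL inventory of one website.
all_urls = {
    "http://www.example.com/url1.html",
    "http://www.example.com/url1.html?param=magic123",
    "http://www.example.com/print/url1.html",
    "http://www.example.com/print/url1.html?param=magic123",
}

# A set is simply "all URLs that share a specific criterion".
with_parameters = {u for u in all_urls if urlsplit(u).query}
print_versions = {u for u in all_urls if urlsplit(u).path.startswith("/print/")}

# Intersection: URLs that meet both criteria at once.
print_with_parameters = with_parameters & print_versions

# Difference: URLs that meet neither criterion.
clean_urls = all_urls - with_parameters - print_versions

print(sorted(print_with_parameters))
print(sorted(clean_urls))
```

Once every rule is written as a set like this, overlaps between rules become visible before they end up in robots.txt or in templates.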
The basis for thinking, planning and action is having profound knowledge of all the URLs that do and may exist on your website
All URLs on the whole website that could ever be there
The separation process has to establish rules to differentiate those URLs that have a role in your SEO efforts from those that don’t
All URLs that search engines should crawl in order to
a) Discover all pages important to your SEO efforts
b) Understand how important each of your URLs is to you
Blocking less important URLs from crawling increases the probability that important URLs will be crawled
You don’t want those to be crawled!
Examples:
• Printable versions of pages
• PDFs
• Small versions of images
• In shops: alternative sorting and filters
If the URLs have to “survive”, crawling can be steered via:
• Robots.txt
• 301-Redirect
• (Alternatively let URLs vanish, e.g. via cryptic Ajax)
Browser add-ons to help detect:
• Roboxt (for Firefox)
• Linkparser (for Firefox & Chrome)
Detection via crawlers, e.g.:
• audisto.com
• Onpage.org
• Deepcrawl
• Screaming Frog
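Whether a robots.txt rule really blocks the URLs you think it blocks can be checked before it goes live. A minimal sketch with Python's standard `urllib.robotparser`; the robots.txt content and the `/print/` and `/pdf/` paths are illustrative assumptions. As far as I know this parser does plain prefix matching only, so rules that rely on Google's wildcard syntax would need a different tool.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the blocked paths are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /pdf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

urls = [
    "http://www.example.com/url1.html",
    "http://www.example.com/print/url1.html",
    "http://www.example.com/pdf/catalogue.pdf",
]


def crawlable(url, user_agent="Googlebot"):
    """True if the parsed robots.txt allows this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


for url in urls:
    print(url, "->", "crawl" if crawlable(url) else "blocked")
```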
Go make the search engine’s crawler your little puppet!
The ugly thing about indexation management? Google has become less “obedient” to it.
Safest solution: Have fewer URLs that you do not want to have crawled!
Indexing strategy focuses on which URLs are important for users only and which for SEO as well
Crawling Indexing Ranking
For each URL: Are you worthy of Google, Matt Cutts and SEO or not?
To how many URLs do you want to “spread” / distribute the link power and authority you have?
Most important ones get a lot vs. everyone gets a tiny bit??
Put your <del>3D</del> sorting glasses on!
First step in finding the right indexation strategy: sorting through all core user-oriented pages of your product
Some URLs are important only for the user & internal linking, but not suited to rank for anything meaningful (also maybe insufficient content quality)
Examples: Help section or pagination pages
User + internal linking
User + internal linking + SEO
Many websites need additional landing pages for their SEO efforts for “translation” purposes
Additional landing / product description pages + internal linking
Don’t confuse with doorway pages.
Translation-layer between product managers and how real people search!
User + internal linking
User + internal linking + SEO
“<meta name="robots" content="…">” is your best friend when it comes to indexation management
Free browser add-on: www.seerobots.com
SEO-oriented landing pages
User + internal linking
User + internal linking + SEO
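Besides a browser add-on, the meta robots directives of a page can be read with a few lines of code. A minimal sketch using Python's standard `html.parser`; the two HTML snippets are made-up examples, and treating a page without any noindex directive as indexable is an assumption of this sketch.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def meta_robots(html):
    """Return the set of meta robots directives found in the HTML."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """Assumption: a page without a noindex directive counts as indexable."""
    return "noindex" not in meta_robots(html)


help_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
landing_page = "<html><head><title>Cigars</title></head></html>"
print(meta_robots(help_page), is_indexable(help_page), is_indexable(landing_page))
```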
The more you focus on Google with its crawling and indexing, the more deterministic SEO successes become
Block from crawlers, e.g. with robots.txt!
Important for users & internal linking: Noindex!
Focus of your SEO efforts (each page also has unique and valuable content): Ready to rank!
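The three layers can be written down as one small decision rule per page type. This is a sketch only: the two flags, the `Treatment` names and the example paths are hypothetical and would have to be replaced by your own sorting of page types.

```python
from enum import Enum


class Treatment(Enum):
    BLOCK = "block from crawlers, e.g. with robots.txt"
    NOINDEX = "crawlable, meta robots noindex"
    INDEX = "crawlable and indexable, ready to rank"


def treatment_for(important_for_users_and_linking, seo_target):
    """Map the two sorting questions to exactly one of the three layers."""
    if seo_target:
        return Treatment.INDEX
    if important_for_users_and_linking:
        return Treatment.NOINDEX
    return Treatment.BLOCK


# Hypothetical page types: (important_for_users_and_linking, seo_target).
page_types = {
    "/print/url1.html": (False, False),
    "/help/shipping": (True, False),
    "/cigars/cuban": (True, True),
}

for path, flags in page_types.items():
    print(path, "->", treatment_for(*flags).value)
```

Deciding per template rather than per URL keeps the rule set small enough to maintain.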
The nice thing about technical SEO: Google likes it and provides great free tools for it!
It’s like half of <del>Webmaster Tools</del> Search Console is for crawling and indexation management!
To check your number of indexed pages, Search Console is your best friend
“Smaller” and specific XML sitemaps can be of great value for checking if your indexation strategy is working out as planned
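One way to get such specific sitemaps is to write one file per page type, so that the indexation of each segment can be looked at separately. A minimal sketch with Python's standard library; grouping by the first path segment is an assumed rule and the URLs are made-up examples.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def segment_of(url):
    """Hypothetical rule: the first path segment names the page type."""
    path = urlsplit(url).path.strip("/")
    return path.split("/")[0] or "home"


def build_sitemaps(urls):
    """Return {segment: sitemap XML string}, one small sitemap per segment."""
    by_segment = defaultdict(list)
    for url in urls:
        by_segment[segment_of(url)].append(url)

    sitemaps = {}
    for segment, members in by_segment.items():
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in sorted(members):
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        sitemaps[segment] = ET.tostring(urlset, encoding="unicode")
    return sitemaps


sitemaps = build_sitemaps([
    "http://www.example.com/cigars/cuban",
    "http://www.example.com/cigars/dominican",
    "http://www.example.com/help/shipping",
])
print(sitemaps["cigars"])
```

If a segment that is supposed to rank shows far fewer indexed URLs than submitted ones, that segment is where the indexation strategy needs a closer look.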
Don’t get confused by “Meta robots”! In 99% of cases: stay away from “nofollow”!
Does nearly make sense!
#FACEPALM!!!
Indexing: Index / Noindex
Crawling: Follow / Nofollow
URLs that will rank! The core of your SEO strategy!
URLs that are important for internal linking only
Be careful! Does not refer to crawling of “this” page but only the ones that are “following”
A bit like blocked from crawling but not top notch.
There are plenty of tools to steer crawling and indexing but not all are equally reliable
Tools by effect:
Crawling, reliable:
• Robots.txt
Crawling, less reliable:
• Meta-Robots “nofollow”
Indexing, reliable:
• Meta-Robots “noindex”
• 301-Redirects
• Search Console “Remove URL”
• Search Console “Parameter Handling”
Indexing, less reliable:
• Rel Next/Prev
• Canonical
• Robots.txt (!)
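One commonly cited reason why robots.txt is a weak indexing tool: a URL that is blocked from crawling can never show its noindex directive to the crawler, so the two instructions get in each other's way. A minimal sketch that flags this combination; the `pages` dictionary and its flag names are hypothetical and would come from a crawler of your choice.

```python
# Hypothetical audit data: path -> flags collected by a crawl of your own site.
pages = {
    "/help/shipping": {"blocked_by_robots_txt": False, "meta_noindex": True},
    "/print/url1.html": {"blocked_by_robots_txt": True, "meta_noindex": True},
    "/cigars/cuban": {"blocked_by_robots_txt": False, "meta_noindex": False},
}


def unreadable_noindex(pages):
    """Paths where noindex is set but crawling is blocked, so it cannot be read."""
    return sorted(
        path
        for path, flags in pages.items()
        if flags["blocked_by_robots_txt"] and flags["meta_noindex"]
    )


print(unreadable_noindex(pages))
```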
• Free Podcast: www.omreport.de
• Books!
• OMCap: Online-Marketing-Conference + Seminars: Oct. 7th & 8th 2015
• PPC Masters: February 2016
Interested in more SEO & Online Marketing? Conferences, Podcast & Books
performicsakm3.de / performicsakm3.com
Paul-Lincke-Ufer 39/40, 10999 Berlin, Germany
Follow us on • Twitter • Google+ • Facebook
Thank you very much for your attention! Let’s keep in touch!