How to Improve Indexing and Crawlability for SEO by Bill Hunt
TRANSCRIPT
Technically Speaking: Improving Indexing
Bill Hunt
Back Azimuth Consulting
@billhunt
• 20 years experience in Enterprise Search
• Big Site/Big Brand/Global Focus
• Co-Author – Search Engine Marketing Inc.
• Solve Complex Workflow problems
• Developed Global Keyword Management Suite
3rd Edition Fall 2014
E: [email protected]
W: www.back-azimuth.com
T: 860.604.8063
Twitter: @billhunt
4 Fundamentals of SEO
SEO
Indexability
Relevance
Authority
Clickability
Improve Indexability
If spiders cannot get it, they can’t score it
• Improving crawl efficiency on large sites
• Reducing errors through error-checking rules in XML Site Map creation
• As development gets more complicated, there is an increased need to ensure indexing
Typical Problem
• Large IT company with 400,000 PDF files
• Broken down to 1k per site map
• Only 1 URL indexed
Submit One but Canonical Another
• Site Map Entry
<url><loc>http://clueless.com/h20195/v2/getDocument.aspx?docname=4AA3-0930FRE</loc><changefreq>weekly</changefreq><priority>0.5</priority></url>
• Page Canonical Entry
<link rel="canonical" href="http://clueless.com/h20195/v2/GetDocument.aspx?docname=4AA3-0930FRE" />
Removed upper case – 98% indexed in 3 weeks with 4,257% increase in traffic from indexed pages
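The mismatch above (getDocument in the site map, GetDocument in the canonical) is easy to miss by eye. A minimal sketch of an automated check follows, assuming the site map XML and page HTML have already been fetched as strings; the names CanonicalParser, sitemap_locs and compare are hypothetical helpers, not part of any tool mentioned in the deck.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_canonical = (attr.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attr.get("href")


def sitemap_locs(sitemap_xml):
    """Return every <loc> value in a site map, ignoring XML namespaces."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.iter()
            if el.tag.endswith("loc") and el.text]


def compare(loc, page_html):
    """Classify how a submitted URL relates to the page's canonical."""
    parser = CanonicalParser()
    parser.feed(page_html)
    canonical = parser.canonical
    if canonical is None:
        return "no canonical"
    if canonical == loc:
        return "match"
    if canonical.lower() == loc.lower():
        return "case mismatch"  # the problem shown on this slide
    return "different URL"
```

Running compare over every loc in a site map and counting the non-"match" results gives a quick error rate before the file is submitted.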
Typical Problem
• Site with ~ 50 million pages
• 12.5 million in XML Site Maps
14.6 Million Errors
Challenges
• URL Case
– Site had both upper and lower case versions of URLs
– Site links to lower case, but lower case canonicals point to upper case and upper case canonicals to lower case
• City/State and State/City and City/ST
– All versions of URLs were available and submitted on different XML files
– The City/ST version was a 302 redirect to lower case with canonical to upper
– Base URL and pagination URLs also submitted on site maps
• Pages with no offers
– Nearly 2 million Soft 404 errors due to no offers
• Canonicals & Pagination
– Consultant suggested “canonical tag” but it was to itself, resulting in 2 to 200 duplicate pages
Indexing AJAX and Facets
• AJAX used to enable “Faceted Browse”
• 100% invisible to search engines
• 2 million+ SKU variations of data to be indexed
• Individual product pages are standard HTML
Solutions
• Mandated all URLs must be lower case
• Mandated /attribute1/attribute2/attribute3 consistency
• Mandated if no result or <5 add noindex & nofollow
• Mandated custom 404 with 404 header
• Dynamically built XML site map based on taxonomy logic
– Added canonical validation
– Added noindex validation
– Added 200 status code validation
• Added Sitemap Error Review to weekly workflow
• ~48 million pages indexed and 2,500% increase in traffic
• All 2.1 million pages indexed driving significant long tail traffic
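The validation rules listed above could be combined into a single gate that every URL must pass before it is written to the XML site map. This is a rough sketch under stated assumptions: the Page record and its fields are hypothetical, and in practice the status code, canonical and result count would come from a crawl or from the platform's own data.

```python
from dataclasses import dataclass

MIN_RESULTS = 5  # from the slide: no results or fewer than 5 gets noindex


@dataclass
class Page:
    """Hypothetical record of what a crawl found for one URL."""
    url: str
    status: int        # HTTP status code returned for the URL
    canonical: str     # href of rel="canonical" on the page
    result_count: int  # number of offers/results listed on the page


def robots_meta(page):
    """Thin pages are kept out of the index."""
    if page.result_count < MIN_RESULTS:
        return "noindex, nofollow"
    return "index, follow"


def sitemap_errors(page):
    """Return the list of rules this page breaks (empty means OK)."""
    errors = []
    if page.url != page.url.lower():
        errors.append("url is not lower case")
    if page.status != 200:
        errors.append("status is not 200")
    if page.canonical != page.url:
        errors.append("canonical differs from url")
    if "noindex" in robots_meta(page):
        errors.append("page is noindex")
    return errors


def build_sitemap_urls(pages):
    """Only pages that pass every check are submitted."""
    return [p.url for p in pages if not sitemap_errors(p)]
```

Logging the output of sitemap_errors for rejected pages would give the weekly error review something concrete to work from.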
Monitor Index Rates
Use the info: operator to test inclusion and compare against Webmaster Tools (WMT)
Monitor critical page inclusions
Note: UK and AU pages are not indexing; the same is true of MX and ES pages
Site Map  URLs  Indexed  Not Indexed  Percent Not  PLP NI  Avg Reindex
AR         139      133           25          18%       2            7
AU         139        5          134          96%      54           23
BR         139      136            3           2%       0            7
DE         139      125           14          10%       1            7
ES         139       12          127          91%      54           12
FR         139       98           41          29%      16            7
JP         139      133            6           4%       0           12
MX         139        8          131          94%      58           14
UK         139        5          134          96%      55           23
US         189      188            1           1%       0            7
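The "Percent Not" column is the not-indexed count divided by the submitted URL count. A small sketch of that arithmetic follows, using the figures from the table; the 50% alert threshold and the function names are assumptions chosen for illustration.

```python
# (submitted URLs, not indexed) per site map, copied from the table above
ROWS = {
    "AR": (139, 25), "AU": (139, 134), "BR": (139, 3), "DE": (139, 14),
    "ES": (139, 127), "FR": (139, 41), "JP": (139, 6), "MX": (139, 131),
    "UK": (139, 134), "US": (189, 1),
}


def percent_not_indexed(urls, not_indexed):
    """Share of submitted URLs that are not indexed, as a whole percent."""
    return round(100 * not_indexed / urls)


def flag_markets(rows, threshold=50):
    """Markets whose not-indexed rate is at or above the threshold."""
    return sorted(market for market, (urls, missing) in rows.items()
                  if percent_not_indexed(urls, missing) >= threshold)
```

With the table's numbers, flag_markets(ROWS) picks out AU, ES, MX and UK, the four markets called out in the note.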
Leverage HREF on Global Sites
• Duplicate content from country pages significantly hurts performance
• Helps search engines understand which version is for which country/language
• Replaces global pages with local pages in search results
• Works within 3 to 5 days
• You can use it in the page, BUT it is more flexible in XML Site Map form
Biggest HREF Mistakes
1. Not listing the primary page in the list
You must list the page itself in the list of URLs
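One way to avoid this mistake is to generate the entries from a single mapping, so that every page in the set carries the full list of alternates, itself included. A minimal sketch, assuming the XML Site Map form of hreflang; the function names are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def url_entry(loc, alternates):
    """Build one <url> block; alternates maps hreflang code -> URL."""
    if loc not in alternates.values():
        raise ValueError(f"{loc} is missing from its own alternate list")
    lines = ["<url>", f"  <loc>{escape(loc)}</loc>"]
    for code, href in alternates.items():
        lines.append(
            f"  <xhtml:link rel=\"alternate\" hreflang={quoteattr(code)} "
            f"href={quoteattr(href)} />"
        )
    lines.append("</url>")
    return "\n".join(lines)


def hreflang_sitemap_entries(alternates):
    """One <url> block per page, each listing every page in the set."""
    return [url_entry(href, alternates) for href in alternates.values()]
```

The enclosing urlset element also needs the xmlns:xhtml="http://www.w3.org/1999/xhtml" namespace declaration for the xhtml:link elements to be valid.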
2. Incorrect Country and Language
• For the UK, the entry is en-GB, not en-UK
• If your German page covers multiple German-speaking countries, don’t use de-DE; use the language-only code de
• For Japan, the entry is ja-JP, not jp-JP
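Code mistakes like these can be caught with a simple lint step before the tags are published. The sketch below is deliberately simplified: it only understands the language and language-REGION forms (no script subtags), and the two lookup tables cover just the mistakes named on this slide rather than the full ISO 639-1 and ISO 3166-1 lists. The function name is hypothetical.

```python
import re

# Known-bad codes mapped to the correct ones (only the slide's examples).
LANGUAGE_FIXES = {"jp": "ja"}            # Japanese is "ja"; "jp" is not a language
REGION_FIXES = {"UK": "GB", "JA": "JP"}  # United Kingdom is "GB"; Japan is "JP"


def check_hreflang(code):
    """Return a list of problems found in one hreflang value."""
    match = re.fullmatch(r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?", code)
    if not match:
        return [f"{code!r} is not in language or language-REGION form"]
    problems = []
    language = match.group(1).lower()
    region = match.group(2).upper() if match.group(2) else None
    if language in LANGUAGE_FIXES:
        problems.append(
            f"language {language!r} should be {LANGUAGE_FIXES[language]!r}")
    if region in REGION_FIXES:
        problems.append(
            f"region {region!r} should be {REGION_FIXES[region]!r}")
    return problems
```

A fuller check would validate against the complete ISO code lists, but even this short table catches the errors that show up most often.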
3. Not Breaking out Regions
• There is no LATAM or EMEA or ME
• www.bigco.com/lamerica_nsc_cnt_amer/en/home.html
Cannot use /es since there is a Spain site
<xhtml:link rel="alternate" hreflang="es-CR" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-BZ" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-HN" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-NI" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-PA" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-SV" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
<xhtml:link rel="alternate" hreflang="es-GT" href="http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html" />
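Because a regional page has to be listed once per country, the entries are tedious to write by hand and are better generated. A short sketch, assuming the country list for the region is maintained somewhere as plain ISO codes; the function name and the CENTRAL_AMERICA constant are hypothetical, with the codes taken from the entries above.

```python
from xml.sax.saxutils import quoteattr

# Countries served by the regional Spanish page, as listed on the slide.
CENTRAL_AMERICA = ["CR", "BZ", "HN", "NI", "PA", "SV", "GT"]


def regional_alternates(language, countries, href):
    """Emit one xhtml:link per country, all pointing at the same page."""
    links = []
    for country in countries:
        code = f"{language}-{country}"
        links.append(
            f"<xhtml:link rel=\"alternate\" hreflang={quoteattr(code)} "
            f"href={quoteattr(href)} />"
        )
    return links
```

Calling regional_alternates("es", CENTRAL_AMERICA, "http://www.bigco.com/lamerica_nsc_cnt_amer/es/home.html") reproduces the seven entries shown on the slide.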
Success Story
• Before HREF Implementation
– Absolut Vodka had global or US page ranking in Top 3 in 27 countries
– While ranking well, few clicks in Asia and Latin America due to English page ranking
• After HREF Implementation
– 100% of 38 local markets had local language home page in Top 3
– Exponential traffic increases in every market
Takeaways
• While not sexy and cutting edge – attention to detail for indexing has significant gains
• Ensure less than 1% of URLs have a problem
• Work with IT teams to clean infrastructure
• Leverage HREF to align global and local pages
• Invest in developing tools that fit your needs