robots.txt and sitemap.xml creation
TRANSCRIPT
ROBOTS.TXT & SITEMAP.XML
Created By Jahid Hasan
WEB ROBOTS
Web Robots (also known as Web Wanderers, Crawlers, or Spiders) are programs that crawl web pages automatically.
Search engines such as Google and Bing use them to index web content.
Spammers use them to scan for email addresses.
Such programs have many other uses too.
WHAT IS ROBOTS.TXT
Robots.txt is a plain text file that you upload to the root (public_html) folder of your website, for example via cPanel.
When the web spiders (ants, bots, indexers) that index your webpage visit your site, they first look at that text file and process it.
More precisely, robots.txt tells the spider which pages to crawl and index and which not to.
THE SIMPLEST VERSION OF ROBOTS.TXT
User-agent: *
Disallow:
The first line, "User-agent: *", indicates that the following lines apply to all agents/bots.
Leaving "Disallow:" blank means that nothing is restricted.
In other words, this robots.txt file does nothing: it allows all robots to see everything on the site.
SOME OTHER COMMON EXAMPLES OF ROBOTS.TXT
To exclude all robots from the entire server:
User-agent: *
Disallow: /
To allow all robots full access:
User-agent: *
Disallow:
(or just create an empty "/robots.txt" file, or don't use one at all)
To exclude all robots from part of the server:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
To exclude a single robot:
User-agent: BadBot
Disallow: /
To allow a single robot:
User-agent: Googlebot
Disallow:
You can also disallow single pages:
User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
You can specify the Sitemap location in your robots.txt file:
User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap.xml
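As a sketch of how a crawler interprets rules like the ones above, Python's standard urllib.robotparser module can parse a robots.txt and answer "may I fetch this URL?" questions. The robots.txt text and the example.com URLs below are hypothetical, modeled on the examples in this section.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, similar to the examples above:
# it blocks /cgi-bin/ for every bot and names a sitemap.
robots_txt = """\
User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A normal page is allowed; anything under /cgi-bin/ is not.
print(rp.can_fetch("*", "http://www.example.com/index.html"))   # True
print(rp.can_fetch("*", "http://www.example.com/cgi-bin/run"))  # False

# The declared sitemap location is also exposed (Python 3.8+).
print(rp.site_maps())
```

Well-behaved crawlers run exactly this kind of check before requesting a page; a "BadBot" can simply ignore the file, which is why robots.txt is a convention, not an access control.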
ONLINE TOOLS TO CREATE ROBOTS.TXT
http://tools.seobook.com/robots-txt/generator/
http://www.robotsgenerator.com/
http://www.mcanerin.com/EN/search-engine/robots-txt.asp
SITEMAP.XML
WHAT IS SITEMAP
A sitemap tells search engines which pages are available for crawling. An XML sitemap is a document that helps Google and other major search engines better understand your website while crawling it.
A Sitemap is an XML file that lists the URLs of a site along with additional metadata about each URL:
when it was last updated
how often it usually changes
how important it is, relative to other URLs on the site
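The three pieces of metadata above map directly onto the lastmod, changefreq, and priority tags of the sitemaps.org format. As a minimal sketch, this Python snippet builds such a sitemap.xml with the standard xml.etree module; the page URLs and metadata values are made-up placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical pages and metadata; real values come from your own site.
pages = [
    ("http://www.example.com/", "2024-01-15", "daily", "1.0"),
    ("http://www.example.com/about.html", "2023-11-02", "monthly", "0.5"),
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)

for loc, lastmod, changefreq, priority in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc                # page URL (required)
    ET.SubElement(url, "lastmod").text = lastmod        # when it was last updated
    ET.SubElement(url, "changefreq").text = changefreq  # how often it usually changes
    ET.SubElement(url, "priority").text = priority      # importance, 0.0 to 1.0

xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

The resulting file would be uploaded to the site root (e.g. http://www.example.com/sitemap.xml) and referenced from robots.txt as shown earlier.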
WHY DO YOU NEED AN XML SITEMAP
XML Sitemaps are important for search engines: they make their job easier.
Even if you rank in the #1 position today, you still need to work on maintaining that position.
THANK YOU