robots.txt and sitemap.xml creation

ROBOTS.TXT & SITEMAP.XML Created By Jahid Hasan

Upload: devsteam

Post on 21-Jan-2017

TRANSCRIPT

Page 1: Robots.txt and Sitemap.xml Creation

ROBOTS.TXT & SITEMAP.XML

Created By Jahid Hasan

WEB ROBOTS

Web robots (also known as web wanderers, crawlers, or spiders) are programs that crawl web pages automatically.

Search engines such as Google and Bing use them to index web content.

Spammers use them to scan for email addresses.

Such programs have many other uses too.

WHAT IS ROBOTS.TXT

Robots.txt is a plain text file that you upload to the root (public_html) folder of your website, for example through cPanel.

When web spiders (also called ants, bots, or indexers) visit your site, they first look for this file and process it.

More precisely, robots.txt tells a spider which pages to crawl and index and which to leave alone.
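Because the file lives in the root folder, its address can be worked out from any page URL on the same site. The following is a minimal sketch of that idea; the helper name `robots_txt_url` and the example.com addresses are illustrative assumptions, not part of the original slides.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    and replaces everything else with the fixed root path /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Any page on the site maps to the same robots.txt file.
print(robots_txt_url("http://www.example.com/blog/post.html?id=7"))
# -> http://www.example.com/robots.txt
```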

THE SIMPLEST VERSION OF ROBOTS.TXT

User-agent: *
Disallow:

The first line, "User-agent: *", indicates that the lines that follow apply to all agents/bots.

An empty value after "Disallow:" means that nothing is restricted.

In other words, this robots.txt file restricts nothing. It allows all types of robots to see everything on the site.
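One way to confirm this reading is to feed the two lines to Python's standard `urllib.robotparser` module and ask whether a bot may fetch a page. This is a small sketch only; the bot name "AnyBot", the helper `is_allowed`, and the example.com URL are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# The simplest robots.txt from the slide.
SIMPLEST_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "AnyBot" is a hypothetical crawler name; an empty Disallow blocks nothing.
print(is_allowed(SIMPLEST_ROBOTS_TXT, "AnyBot",
                 "http://www.example.com/private/page.html"))  # -> True
```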

SOME OTHER COMMON EXAMPLES OF ROBOTS.TXT

To exclude all robots from the entire server:
User-agent: *
Disallow: /

To allow all robots full access:
User-agent: *
Disallow:

(or just create an empty "/robots.txt" file, or don't use one at all)

To exclude all robots from part of the server:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/

To exclude a single robot:
User-agent: BadBot
Disallow: /
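The two examples above can be combined into one file and checked with Python's standard `urllib.robotparser`. Treat this as a sketch under assumptions: "GoodBot", the helper `check`, and the example.com host are invented for the demonstration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Both records from the slide in a single file, separated by a blank line.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def check(agent: str, path: str) -> bool:
    """Return True if agent may fetch path on the (hypothetical) example host."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(check("BadBot", "/index.html"))       # -> False (BadBot is shut out entirely)
print(check("GoodBot", "/index.html"))      # -> True  (not under a disallowed folder)
print(check("GoodBot", "/tmp/cache.html"))  # -> False (inside /tmp/)
```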

To allow a single robot and exclude all others:
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

You can disallow single pages:
User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html

You can specify the Sitemap location in your robots.txt file:

User-agent: *
Disallow: /

Sitemap: http://www.example.com/sitemap.xml
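A crawler-side view of the Sitemap line can be sketched with Python's standard `urllib.robotparser`, whose `site_maps()` method (available from Python 3.8) returns the sitemap URLs declared in the file. The snippet below simply reuses the example from the slide and is an illustration, not a full crawler.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt example from the slide, including its Sitemap line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap line is present, so fall back to [].
sitemaps = parser.site_maps() or []
print(sitemaps)  # -> ['http://www.example.com/sitemap.xml']
```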

ONLINE TOOLS TO CREATE ROBOTS.TXT

http://tools.seobook.com/robots-txt/generator/

http://www.robotsgenerator.com/

http://www.mcanerin.com/EN/search-engine/robots-txt.asp

SITEMAP.XML

WHAT IS A SITEMAP

A sitemap tells search engines which pages are available for crawling.

An XML sitemap is a document that helps Google and other major search engines better understand your website while crawling it.

A sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL: when it was last updated, how often it usually changes, and how important it is relative to other URLs in the site.
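To make the structure concrete, here is a minimal sketch that builds such a file with Python's standard `xml.etree.ElementTree`. The element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org protocol; the function `build_sitemap`, the page list, and every value in it are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML text with one <url> entry per page dictionary."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]                # page address
        ET.SubElement(url, "lastmod").text = page["lastmod"]        # last update
        ET.SubElement(url, "changefreq").text = page["changefreq"]  # how often it changes
        ET.SubElement(url, "priority").text = page["priority"]      # relative importance
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
pages = [
    {"loc": "http://www.example.com/", "lastmod": "2017-01-21",
     "changefreq": "weekly", "priority": "1.0"},
    {"loc": "http://www.example.com/about.html", "lastmod": "2017-01-10",
     "changefreq": "yearly", "priority": "0.5"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting text would be saved as sitemap.xml in the site root, the same location the Sitemap line in robots.txt points to.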

WHY DO YOU NEED AN XML SITEMAP

XML sitemaps are important for search engines.

They make the search engines' job easier.

Even if you rank in the #1 position today, you still need to take care of maintaining that position.

THANK YOU