A robots.txt file is a plain text file that tells search engine crawlers which pages or files on your site you don't want them to access.
To create one, make a plain text file named robots.txt and place it in your site's root directory. (Note: the file must be named exactly robots.txt.)
Syntax:
It is based on two keywords, "User-agent" and "Disallow". The user-agent is the search engine robot or crawler software a rule applies to. Disallow is a command that tells that user-agent not to access a particular URL path. To give crawlers access to a URL that sits inside a disallowed parent directory, you can use a third keyword, Allow.
User-agent: [the name of the robot the following rule applies to]
Disallow: [the URL path you want to block]
Allow: [the URL path of a subdirectory, within a blocked parent directory, that you want to unblock]
For example, to block Googlebot from the entire site:
User-agent: Googlebot
Disallow: /
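The Allow keyword mentioned above can be illustrated like this (the directory names here are hypothetical, and Allow is honored by crawlers that support it, such as Googlebot): block a parent directory while unblocking one of its subdirectories.

```
User-agent: *
Disallow: /photos/
Allow: /photos/public/
```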
Some examples that show how robots.txt works:
1) To exclude all robots from the entire server
User-agent: *
Disallow: /
2) To allow all robots complete access
User-agent: *
Disallow:
3) To exclude all robots from part of the server
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
4) To exclude a single robot
User-agent: BadBot
Disallow: /
5) To allow a single robot
User-agent: Googlebot
Disallow:
User-agent: *
Disallow: /
6) To exclude all files except one
The simplest way is to move the files you want blocked into a separate directory (here, stuff) and disallow that directory:
User-agent: *
Disallow: /~joe/stuff/
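The examples above can be checked offline with Python's standard-library robots.txt parser. This is a minimal sketch using rules like example 3; the example.com URLs are placeholders.

```python
# Simulate how a crawler interprets robots.txt rules using the
# standard library's urllib.robotparser.
from urllib.robotparser import RobotFileParser

# Rules modeled on example 3 above: block two directories for all robots.
rules = """
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# A URL under a disallowed directory is blocked; everything else is allowed.
print(rp.can_fetch("*", "https://example.com/cgi-bin/script"))  # False
print(rp.can_fetch("*", "https://example.com/index.html"))      # True
```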
You can also test your robots.txt file with the robots.txt Tester, a Google Webmaster Tools feature that checks your robots.txt file.
How the robots.txt Tester works:
- From the Webmaster Tools Home page, choose the site whose robots.txt file you want to test.
- Expand the Crawl heading on the left dashboard, and select the robots.txt Tester tool.
- Make test changes to your robots.txt in the tool's text editor (these edits are not saved to your live file).
- Scroll through the robots.txt code to locate the highlighted syntax warnings and logic errors. The number of syntax warnings and logic errors is shown immediately below the editor.
- Type the URL path you want to test in the text box at the bottom of the page.
- Select the user-agent you want to simulate in the dropdown list to the right of the text box.
- Click the TEST button next to the user-agent dropdown to run the simulation.
- Check whether the TEST button now reads ACCEPTED or BLOCKED to find out if the URL you entered is blocked for that crawler.
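The tester's per-user-agent simulation can be roughly approximated offline with Python's urllib.robotparser; the BadBot rules mirror examples 4 and 5 above, and the URL is a placeholder.

```python
# Check the same rules against different user-agents, similar to
# choosing an agent in the tester's dropdown.
from urllib.robotparser import RobotFileParser

# Block one robot entirely; allow everyone else (examples 4 and 5 above).
rules = """
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# BadBot is blocked everywhere; any other crawler is allowed.
print(rp.can_fetch("BadBot", "https://example.com/page.html"))     # False
print(rp.can_fetch("Googlebot", "https://example.com/page.html"))  # True
```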
Note: if you do not want any search engine to crawl your website, use:
User-agent: *
Disallow: /