6 Facts: Why You Should Include Robots.txt in Your Blog for Huge Traffic

robots.txt

Robots.txt is a simple text file, saved on Google’s servers for Blogger blogs, which tells the ‘Web Crawler’ how to index your blog and show it in the search results. With the help of this ‘text file’ you can restrict the ‘Web Crawler’ from indexing some of your unimportant pages. The robots.txt file contains instructions telling the ‘Web Crawler’ which pages of your blog are to be indexed and which are not.


All blogs have a default robots.txt file as shown below :-


User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED


When we include this line in the blog (Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED) it helps the ‘Web Crawler’ to index our blog faster and more easily. When you attempt to enter this code by going to the ‘Settings’ page you will find a warning there. It says :-

“Warning! Use with caution. Incorrect use of these features can result in your blog being ignored by search engines.”


So, one has to be very careful while adding this code to the blog. Follow these steps :-

1) Log in to your Dashboard

2) Click : Settings

3) Click : Search preferences

4) Now go to the ‘Crawlers and indexing’ section

5) Next to ‘Custom robots.txt’, click ‘Edit’

6) Now click ‘Yes’ and a small box opens

7) Now copy the code given above and paste it in the small box

8) Now replace ‘example.blogspot.com’ with the URL of your blog, and you are done.
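Once saved, you can confirm the file is live by opening http://yourblog.blogspot.com/robots.txt in a browser. If you prefer, here is a minimal Python sketch that does the same check; the address example.blogspot.com is only a placeholder, so substitute your own blog’s URL :-

import urllib.request

# Placeholder address: replace example.blogspot.com with your blog's URL
url = "http://example.blogspot.com/robots.txt"
with urllib.request.urlopen(url) as resp:
    # Prints the live robots.txt rules exactly as crawlers will see them
    print(resp.read().decode("utf-8"))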



Explanation of the robots.txt file :-


 i) User-agent: Mediapartners-Google

     This code is meant for Google’s AdSense crawler (‘Mediapartners-Google’), which helps it serve ads related to the blog. This code should not be disturbed.

ii) User-agent: *
    Disallow: /search

            This code is an instruction to the ‘Web Crawler’ that the label links (which live under /search) are not to be indexed. As a default setting, a blog’s label links are never indexed by the crawler. If we remove this code the crawler will index the entire blog, including label links. This code should not be disturbed either.


iii) Allow: /

             This code instructs the crawler to index our home page and every other page that is not disallowed above.
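The rules above work together, and you can see their combined effect with a quick sketch using Python’s standard urllib.robotparser module. The rules are parsed offline here, and the URLs are illustrative placeholders only :-

from urllib.robotparser import RobotFileParser

# The default Blogger rules, parsed offline for demonstration
rules = [
    "User-agent: Mediapartners-Google",
    "Disallow:",
    "",
    "User-agent: *",
    "Disallow: /search",
    "Allow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# Label pages live under /search, so ordinary crawlers must skip them
print(rp.can_fetch("*", "http://example.blogspot.com/search/label/SEO"))      # False
# Normal posts are allowed
print(rp.can_fetch("*", "http://example.blogspot.com/2024/01/my-post.html"))  # True
# The AdSense crawler is allowed everywhere
print(rp.can_fetch("Mediapartners-Google", "http://example.blogspot.com/search/label/SEO"))  # True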



iv) Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED


     This is the sitemap of a blog, which includes all of our published posts/articles. By adding this line to the blog one can easily improve the crawl rate: whenever the ‘Web Crawler’ starts indexing the pages of a blog, it becomes easier for the crawler to find and index all the published posts.


Now, to check whether it has been done correctly, just copy the sitemap URL, paste it into another browser window and hit Enter; you will see an XML feed of all your posts/articles.
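You can also check this programmatically. Here is a small sketch that fetches the feed and prints the post titles, assuming the feed is in Blogger’s default Atom format; example.blogspot.com is again a placeholder for your own blog’s URL :-

import urllib.request
import xml.etree.ElementTree as ET

# Placeholder address: replace example.blogspot.com with your blog's URL
feed_url = "http://example.blogspot.com/feeds/posts/default?orderby=UPDATED"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by Blogger feeds

with urllib.request.urlopen(feed_url) as resp:
    root = ET.fromstring(resp.read())

# Each <entry> in the feed is one published post
for entry in root.iter(ATOM + "entry"):
    print(entry.find(ATOM + "title").text)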