Setting Robots.txt on a Blog

https://motifunny.blogspot.com/2012/12/setting-robotstxt-in-blog.html



This post is the second part of a two-part article about robots.txt on Blogger:
1. Introduction to Robots.txt: Commands, Functions, and Their Effect on SEO. If you have not read it yet, please do so before reading this section.
2. Case Study and Use of Robots.txt on Blogger, which you are reading on this page.
Case Study: Robots.txt on Blogger
Now that you know a few things about robots.txt commands, I will present a few experiments to show how effective robots.txt is on Blogger.

First, for the uninitiated: you can access this feature through Dashboard > Settings > Search preferences, then look at the bottom for Custom robots.txt. (Note: do not edit anything before you know exactly what to write.)

As background, Blogger generates robots.txt automatically; it contains default commands set by Blogger. You can view it in Google Webmaster Tools, on the "Crawler access" sub-page. Or, more easily, view the file directly by appending the file name robots.txt to the blog URL, e.g. http://motyfunny.blogspot.com/robots.txt. By default it contains these command lines:

User-agent: Mediapartners-Google
Disallow:  
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://motyfunny.blogspot.com/feeds/posts/default?orderby=updated

The first group of lines is for the AdSense user agent; it allows the AdSense crawler to index your website/blog. Its function is to determine the content of a page so that more relevant ads can be shown, and it is not the same as (or related to) the Google search engine crawler. For those who use AdSense on Blogger, this is certainly beneficial and should not be removed. For those who use AdSense on WordPress (self-hosted, not the free wordpress.com, which does not allow ads anyway), Joomla, Drupal, etc., add this command to help the AdSense crawler.
The second group of lines is a command for all search engine crawlers. Notice the /search entry: it is the directory containing Blogger labels, which are better left unindexed.
And the third part is the sitemap (the feed plus an orderby parameter, containing a list of the latest updates), which helps speed up indexing.
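
As an illustration of the second group: on Blogger, label pages and search-result pages all live under the /search path, so the single rule Disallow: /search covers them all. For example (the label name SEO here is just an example), it blocks pages such as:

http://motyfunny.blogspot.com/search/label/SEO
http://motyfunny.blogspot.com/search?q=robots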

My case study focused on labels. A label page should be "forbidden" from being indexed because it is not a real page and can lead to duplicate content, which is certainly not good for SEO. The same applies to archive pages.

Methods
1. Using rel=nofollow on label links.
I removed the indexing prohibition (Disallow: /search) and used rel=nofollow on the label links instead, as I have described before.
2. Removing rel=nofollow from the labels and re-applying the disallow command in robots.txt (returning label indexing to its original setting).
After some time observing the results of the first method, I re-applied the disallow command for label pages and removed the rel=nofollow tags.
3. Using the rel=nofollow tag on label links as well as the robots.txt prohibition (Disallow: /search).
After seeing the results of the second method, I used both nofollow and the Disallow command in robots.txt at the same time (see the sketch after this list).
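
For a clearer picture, here is a minimal sketch of the two ingredients these methods combine (the label name SEO and the blog URL are only examples). Method 1 adds rel=nofollow to label links in the template:

<a href='http://motyfunny.blogspot.com/search/label/SEO' rel='nofollow'>SEO</a>

Method 2 relies on the disallow rule in robots.txt instead:

User-agent: *
Disallow: /search

Method 3 simply applies both at once.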

Results
The results of the three methods were quite different:
1. Using rel=nofollow alone on the labels, the errors recorded in Webmaster Tools were still there: out of roughly 90 crawl errors, the reduction was not significant, only about 10-15 over a matter of weeks, and it did not prevent label pages from being indexed.
2. With the robots.txt disallow, without rel=nofollow, crawl errors dropped considerably: of the roughly 65-80 label errors that remained, only around 30-40 were left in less than a week.
3. And finally, using both, the result was by far the most significant, so let me just write it as: result = 0! Nothing remained as a problem in the crawl errors in Webmaster Tools, and it all took only a short time.

Conclusion
As has been presented in webmaster forums, and by search engines such as Google itself, using robots.txt does not necessarily prohibit indexing directly; sometimes it is effectively "up to me," says Google. For example, suppose you block a page from being indexed, but that page has many backlinks (backlinks can come from your own pages, i.e. internal links, or from other sites, i.e. external links); that page can still be indexed by Google and rendered with the backlinks' anchor text. By using the two commands at once, at least we eliminate the leftover followed backlinks on the blog's own pages. In other words, if you put a followed label link on a particular page, Google still considers it and it appears among the crawl errors, even though we have used Disallow in robots.txt. So, if you want labels not to be indexed, to maximize SEO, also use rel=nofollow on the label links. This of course also applies to any other pages you do not want search engines to index (e.g. archives).
How to Edit and Fill In the Custom Robots.txt
a. As mentioned above, access the robots.txt editor via Dashboard > Settings > Search preferences > Crawlers and indexing > Custom robots.txt, then click Edit.
b. Then click Yes, fill in the robots.txt commands you want, and save.
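
For example, if, as in the case study, you also wanted to block archive pages, the custom robots.txt could keep the default rules and add one line for archives. This is only a sketch: the blog URL is an example, Blogger archive URLs end in _archive.html, and the * wildcard is understood by Googlebot but not necessarily by every crawler:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Disallow: /*_archive.html
Allow: /

Sitemap: http://motyfunny.blogspot.com/feeds/posts/default?orderby=updated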

Important: if you have no need to block indexing of any particular page, do not change anything. If you want to restore the default (Blogger's default robots.txt), revert by selecting "No" and saving.

Actually, blocking indexing is quite important: it lets us control pages such as labels (blocked by default) and archives, which clearly lead to duplicate content. If you are not comfortable using robots.txt to stop indexing, I suggest using a meta noindex tag to avoid duplication from pages such as archives; that way is much easier.
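
A minimal sketch of that meta tag approach, assuming a standard Blogger template (back up your template first, then place this inside the <head> section); the b:if conditional limits the noindex to archive pages only:

<b:if cond='data:blog.pageType == &quot;archive&quot;'>
<meta content='noindex,follow' name='robots'/>
</b:if>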

From what I have seen, using robots.txt combined with some other methods, for example rel=nofollow, simply produces higher-quality search results and much higher traffic. I hope the same for your blog.

