What To Expect When MSNBot Becomes BingBot

Bing has officially announced that the veteran MSNBot will be retired for good on October 1, 2010. The user agent is being renamed to reflect Microsoft’s current search brand, Bing. In place of the old web crawler, which currently shows up in log statistics as “Mozilla/5.0 (compatible; msnbot/2.0b (+http://search.msn.com/msnbot.htm)”, the new one will identify itself as “Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)”. The HTTP “From” header field will also change to “From: bingbot(at)microsoft.com”, which means that “From: msnbot(at)microsoft.com” will no longer appear.
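For anyone who filters log statistics by crawler, the rename means looking for a new token in the user agent string. Below is a minimal Python sketch of that check. The function name `classify_crawler` is hypothetical, and the sketch assumes that a case-insensitive search for the tokens “bingbot” and “msnbot” is enough. User agent strings can be spoofed, so this only tells you what a visitor claims to be.

```python
from typing import Optional


def classify_crawler(user_agent: str) -> Optional[str]:
    """Return "bingbot", "msnbot", or None for a logged User-Agent value.

    Assumption: the crawler name appears as a plain token somewhere in the
    string, as in the two user agent strings quoted above.
    """
    lowered = user_agent.lower()
    # Check the new name first so it wins if both tokens ever appear.
    for token in ("bingbot", "msnbot"):
        if token in lowered:
            return token
    return None


if __name__ == "__main__":
    new_ua = "Mozilla/5.0 (compatible; bingbot/2.0 +http://www.bing.com/bingbot.htm)"
    print(classify_crawler(new_ua))
```

Log filters or reports that match only on “msnbot” would need the new token added before October 1st; otherwise BingBot visits would go uncounted.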

Rick DeJarnette of the Bing Crawling and Indexing Team stated on the Bing Webmaster blog that the crawler is also undergoing some undisclosed improvements: http://www.bing.com/community/blogs/webmaster/archive/2010/06/28/bing-crawler-bingbot-on-the-horizon.aspx. We therefore expect this to be more than a change of name, and hope for a genuinely improved version of the search engine spider.

Hints of what to expect from BingBot:

  • BingBot will honor the directives that webmasters have written for msnbot in robots.txt files, so no one is required to change anything to accommodate the new search engine bot.
  • The beta tag that MSNBot presently carries (the “b” in “msnbot/2.0b”) will be dropped. Dropping it signals that, by October 1st, the crawler will have undergone rigorous testing to ensure its stability.
  • If a webmaster gives separate directives to BingBot and MSNBot, only the directive(s) for BingBot will be adhered to.

A typical example of such directives is this:
User-agent: bingbot
Disallow: /mail/

User-agent: msnbot
Disallow: /mail/
Disallow: /admin/

In this scenario, BingBot will not crawl the “mail” directory, because a directive addressed to it specifically forbids that. It will crawl every other directory on the site, including the “admin” directory, which is disallowed only for MSNBot. As you can see, the directives set for BingBot override the ones set for the retiring user agent in the same robots.txt file. This rule is implemented to avoid any conflict between the two sets of directives.
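The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as announced, not Bing’s actual implementation. The helper names (`parse_groups`, `bingbot_rules`, `bingbot_may_crawl`) are hypothetical, the parser handles only “User-agent” and “Disallow” lines with simple prefix matching, and the final fallback to a “*” group is an assumption based on the usual robots.txt convention rather than something stated in the announcement.

```python
ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /mail/

User-agent: msnbot
Disallow: /mail/
Disallow: /admin/
"""


def parse_groups(text):
    """Map each lowercased User-agent token to its list of Disallow prefixes."""
    groups = {}
    current_agents = []
    reading_rules = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or unrecognized line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                reading_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            reading_rules = True
            if value:  # an empty Disallow blocks nothing
                for agent in current_agents:
                    groups[agent].append(value)
    return groups


def bingbot_rules(groups):
    """Pick the single group BingBot obeys: its own, else msnbot's, else "*"."""
    for agent in ("bingbot", "msnbot", "*"):  # "*" fallback is an assumption
        if agent in groups:
            return groups[agent]
    return []


def bingbot_may_crawl(path, groups):
    """True if no Disallow prefix in the chosen group matches the path."""
    return not any(path.startswith(prefix) for prefix in bingbot_rules(groups))


if __name__ == "__main__":
    groups = parse_groups(ROBOTS_TXT)
    for path in ("/mail/inbox", "/admin/", "/blog/"):
        print(path, bingbot_may_crawl(path, groups))
```

Run against the example file, the sketch blocks “/mail/” and allows “/admin/”. If the bingbot group is removed, the msnbot group applies and both directories are blocked. A webmaster who adds a bingbot section therefore has to repeat every rule that should still apply.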

The areas webmasters expect BingBot to improve upon:
Several webmasters I exchange ideas with hope to see improvements in the following aspects of Bing crawling and indexing.

  • Fast indexing of new websites. MSNBot is a very slow spider and generally takes three to four months to index new sites. This is in direct contrast to GoogleBot and Yahoo Slurp, which index new websites in less than two weeks. Since Bing will start powering Yahoo search sometime in 2011, we expect faster indexing from its crawler. This will help it to compete favorably with Google.
  • Bing has a very small indexing capacity compared with Google and Yahoo. For its search alliance with Yahoo to deliver good results, webmasters expect it to increase that capacity so that it can index almost all the well-optimized pages of a website.
  • Bing search results are, in some ways, less relevant than those of Google. Bing has to show more relevant results on its SERPs to break Google’s near monopoly. This is an area where BingBot needs improvement.
  • MSNBot is a resource hog. It consumes a lot of bandwidth while crawling websites, which causes problems for sites hosted on shared servers. Hopefully the incoming BingBot will correct this issue.

The Bing Crawling and Indexing Team welcomes feedback, constructive criticism, opinions and ideas from the general public. If you have questions or suggestions that would help to improve the new search engine spider, BingBot, don’t hesitate to contact the team at the following email address: bingbot@microsoft.com.