A sitemap is essentially a list of the important URLs of a website that a webmaster wants search engines to index. Although the spiders of major search engines can locate the pages of a site step by step by following links, that process may take a long time to reach deeply buried pages. Some sites also have pages that no internal or external link points to, so crawlers cannot reach them by following links at all. A sitemap therefore speeds up the discovery of your pages, and for pages that are otherwise unreachable it is what makes discovery possible in the first place.
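To make the point about deeply buried and unreachable pages concrete, here is a minimal sketch of link-following discovery. The site structure, the page paths and the function name crawl_depths are all invented for illustration; real crawlers are far more elaborate.

```python
from collections import deque

def crawl_depths(links, start):
    """Breadth-first walk over a link graph; returns {page: clicks from start}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: nothing links to "/orphan".
links = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/2019"],
    "/blog/2019": ["/blog/2019/old-post"],
    "/orphan": [],
}

depths = crawl_depths(links, "/")
all_pages = set(links) | {t for targets in links.values() for t in targets}
unreachable = sorted(all_pages - set(depths))  # pages a link-following bot never sees
```

In this toy graph the old post sits three clicks away from the home page, and the orphan page is never discovered by following links. Listing both URLs in a sitemap hands them to the crawler directly.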
There are three types of sitemaps: the XML sitemap, the HTML sitemap and the URL List.
The XML (Extensible Markup Language) sitemap is created specifically to help search bots discover a site’s inner web pages easily. The XML sitemap format may contain additional data that allows a webmaster to give extra information about each individual page. An XML sitemap file begins with an XML declaration:
- <?xml version="1.0" encoding="UTF-8"?>
The loc tags (<loc></loc>) are required for every URL entry in an XML sitemap, as they identify the location (the full URL) of the page being listed.
The priority tags (<priority></priority>) are optional, and are used to let bots know how important the corresponding page is relative to the other pages of the same site. Valid values range from 0.0 to 1.0, and the default is 0.5.
The changefreq tags (<changefreq></changefreq>) are also optional, and they tell search spiders how frequently the content of the corresponding page is likely to change. Valid values are always, hourly, daily, weekly, monthly, yearly and never. Having changefreq tags enables search engines to anticipate when the page’s content is likely to change.
The lastmod tags (<lastmod></lastmod>), also optional, tell search engine bots when the corresponding page was last modified. The date is written in W3C Datetime format, for example YYYY-MM-DD.
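The four tags can be put together in a short script. What follows is a minimal sketch that uses Python’s standard xml.etree.ElementTree module (Python 3.8 or later); the function name build_sitemap, the dictionary layout and the example.com URLs are assumptions made for illustration, not part of any official tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (bytes) for a list of dicts with a required 'loc' key."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]  # required
        for tag in ("lastmod", "changefreq", "priority"):  # optional
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

# Hypothetical pages, for illustration only.
xml_bytes = build_sitemap([
    {"loc": "https://www.example.com/", "lastmod": "2020-01-15",
     "changefreq": "weekly", "priority": 0.8},
    {"loc": "https://www.example.com/about"},
])
```

Each page becomes one url element inside a single urlset element. The second entry shows that loc alone is enough for a valid entry.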
It is worth noting that search engines treat the lastmod, priority and changefreq tags as hints, not as commands. If you add them to your site’s XML sitemap, make sure the information you provide is accurate. Inaccurate data may lead bots to disregard these values, or even the sitemap as a whole, which is an effect you do not want for your site.
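A script cannot prove that these values are truthful, but it can catch the obviously wrong ones before the sitemap is published. The sketch below is one possible check, with assumptions worth stating: check_entry is a made-up name, each entry is a plain dictionary, and only the simple YYYY-MM-DD form of lastmod is accepted even though the W3C Datetime format also allows fuller timestamps.

```python
from datetime import date

ALLOWED_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                      "monthly", "yearly", "never"}

def check_entry(entry):
    """Return a list of problems found in one sitemap entry (empty if none)."""
    problems = []
    if not entry.get("loc"):
        problems.append("missing loc")
    if "changefreq" in entry and entry["changefreq"] not in ALLOWED_CHANGEFREQ:
        problems.append("unknown changefreq")
    if "priority" in entry:
        try:
            in_range = 0.0 <= float(entry["priority"]) <= 1.0
        except (TypeError, ValueError):
            in_range = False
        if not in_range:
            problems.append("priority outside 0.0-1.0")
    if "lastmod" in entry:
        try:
            # Assumption: date-only form; a future date cannot be accurate.
            if date.fromisoformat(entry["lastmod"]) > date.today():
                problems.append("lastmod in the future")
        except (TypeError, ValueError):
            problems.append("lastmod is not YYYY-MM-DD")
    return problems
```

Running every entry through such a check before generating the file is a cheap way to keep implausible hints out of the sitemap.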
An HTML sitemap is an ordinary HTML web page that helps both bots and humans locate all the web pages of a site in one place. It is a hierarchical outline of the site’s most important pages, the ones the webmaster wants search spiders and visitors to reach. It helps users navigate websites that have numerous pages or confusing navigation. Much like a book’s table of contents, it lets users quickly find the information they’re interested in without having to click through the entire site. If an HTML sitemap is well constructed and its links properly mirror the site structure, visitors won’t feel lost during navigation; they will easily understand how the website is organized and can find information accurately and quickly. In search engine optimization, an HTML sitemap likewise gives search engine bots a single page from which they can follow links to every other page of the website.
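Because an HTML sitemap mirrors the site hierarchy, it maps naturally onto nested lists. Here is a minimal sketch, assuming the site structure is available as nested (title, url, children) tuples; that layout and the function name render_html_sitemap are illustrative choices rather than a standard.

```python
from html import escape

def render_html_sitemap(pages):
    """Render nested (title, url, children) tuples as nested <ul> lists."""
    if not pages:
        return ""
    items = []
    for title, url, children in pages:
        link = '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(title))
        # Child pages become a nested list inside the parent's list item.
        items.append("<li>{}{}</li>".format(link, render_html_sitemap(children)))
    return "<ul>{}</ul>".format("".join(items))

# Hypothetical site structure, for illustration only.
pages = [("Home", "/", [("Blog", "/blog", []), ("About", "/about", [])])]
html_sitemap = render_html_sitemap(pages)
```

The result is an HTML fragment that can be dropped into an ordinary page template. Titles and URLs are escaped so that characters such as & do not break the markup.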
A URL List is a plain text file that lists the URLs of a website, one URL per line. This form of sitemap is supported by Yahoo Search, and it’s worth having in place.
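A URL List is simple enough to produce with a few lines of code. The sketch below assumes the URLs are already known and absolute; build_url_list is a hypothetical helper name, and the duplicate filtering is a convenience added here rather than a requirement of the format.

```python
def build_url_list(urls):
    """Return the text of a URL List sitemap: one absolute URL per line."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue  # skip blanks and duplicates
        if not url.startswith(("http://", "https://")):
            raise ValueError("not an absolute URL: " + url)
        seen.add(url)
        lines.append(url)
    return "".join(line + "\n" for line in lines)

# Hypothetical URLs, for illustration only.
url_list_text = build_url_list([
    "https://www.example.com/",
    " https://www.example.com/about ",
    "https://www.example.com/",
])
```

The returned text can then be saved as a UTF-8 encoded file, for instance under a name such as urllist.txt, and placed on the site.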
All three sitemap formats, HTML, URL List and XML, play important roles in search engine optimization. However, the XML format is the one supported by all major search engines, and hence it’s the one that carries the greatest weight.