Sitemaps – How Important Are They?

Google introduced Sitemaps back in 2005 to give site owners a way to tell it which pages on their site were available for crawling, or had been modified and needed revisiting. Before Sitemaps, you had to submit your site to each search engine manually and rely on its crawler to discover the entire site before it could be indexed. This had its shortfalls: some sites are hugely dynamic, which made it difficult for Google to find every page and keep its cached versions up to date.

A Sitemap takes the form of an XML file, and its primary use is to list all the webpages you want the search engines to index. However, it can also carry more information, such as when a page was last updated, how important it is and how often it changes. This can be valuable for SEO purposes, as it hints to Google how often to revisit a site and shows it which pages have changed. A Sitemap cannot be used to exclude pages, though; to specify which pages you do not want crawled, you must use the robots.txt file.
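As a rough illustration of keeping crawlers away from pages through robots.txt rather than the Sitemap, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and asks whether a URL may be fetched. The rules, the /private/ path, the ExampleBot user agent and the is_crawl_allowed helper are all made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawl_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules above let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/blog/",
                "http://example.com/private/report.html"):
        print(url, is_crawl_allowed(url))
```

Well-behaved crawlers apply the same kind of check before requesting a page, which is why exclusions belong in robots.txt and not in the Sitemap.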

Sitemaps are particularly useful for larger websites with many categories and internal links that reach deep into the site's structure, where the spider may overlook pages. With a Sitemap, the Google robots can arrive at your site and know exactly what they need to look at and where to find it. Don’t forget that in some cases the Google spiders/robots may not visit your site regularly, so make the most of it when they are there. The less time the robots spend looking for pages on your site, the more time they can spend retrieving useful information and indexing the necessary pages.

Image Credit: http://www.dataplex-systems.com/SiteMap.html

If your site is small and does not contain a lot of dynamic pages, a Sitemap may not make much difference to its discovery and indexing. It certainly will not do any harm, though, and it is good practice to submit a Sitemap to the search engines.

A Sitemap's location is very important, as it determines which URLs the Sitemap can include. For example, if the Sitemap is located at http://example.com/images/sitemap.xml, then it can include any URL starting with http://example.com/images/. If the Sitemap is located at http://example.com/sitemap.xml, then it can include all pages on the host. Note that even in this location it cannot include pages from http://subdomain.example.com/. It is therefore highly recommended that you place your Sitemap in the root directory of your domain, unless you require different Sitemaps for different paths. The Sitemap's location can also be listed in the robots.txt file so that crawlers find it easily.
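The location rule can be sketched in a few lines of Python. This is a simplified check, assuming that a Sitemap covers exactly the URLs that share its scheme, host and directory prefix; the function names sitemap_scope and in_scope are invented for this example, and details such as host-name case or default ports are ignored.

```python
from urllib.parse import urlsplit


def sitemap_scope(sitemap_url: str) -> str:
    """Return the URL prefix that a Sitemap at sitemap_url may cover."""
    parts = urlsplit(sitemap_url)
    # Drop the file name and keep the directory, e.g. /images/sitemap.xml -> /images/
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def in_scope(page_url: str, sitemap_url: str) -> bool:
    """Return True if page_url starts with the Sitemap's directory prefix."""
    return page_url.startswith(sitemap_scope(sitemap_url))


if __name__ == "__main__":
    root_sitemap = "http://example.com/sitemap.xml"
    images_sitemap = "http://example.com/images/sitemap.xml"
    print(in_scope("http://example.com/blog/post.html", root_sitemap))
    print(in_scope("http://example.com/blog/post.html", images_sitemap))
    print(in_scope("http://subdomain.example.com/page.html", root_sitemap))
```

Running a check like this over your URL list before publishing a Sitemap is one way to catch entries the search engines would ignore for being out of scope.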

Sitemap files must be UTF-8 encoded; most editors let you choose this when saving the document. The valid XML tags used to make a working Sitemap are listed below, each with a definition explaining what it does.

XML Tag Definitions

Tag           Status      Description

<urlset>      required    This is the root element of a Sitemap file
                          and contains all individual <url> tags.

<url>         required    This is the storage container for an individual
                          URL within a Sitemap, and also contains the
                          <loc>, <lastmod>, <changefreq> and <priority>
                          tags.

<loc>         required    This contains the full URL of the page, including
                          the protocol (for example http://).

<lastmod>     optional    This tag stores the date that the page was last
                          modified, formatted as YYYY-MM-DD or, with the
                          time included, YYYY-MM-DDThh:mm:ss+00:00, where
                          ‘+00:00’ denotes the time zone offset from UTC.

<changefreq>  optional    This element acts as an indicator to the search
                          engine of how often the page in question is
                          updated. Note that it is only a hint and may not
                          match how often the search engine actually crawls
                          your page. Valid values are:
                            - always
                            - hourly
                            - daily
                            - weekly
                            - monthly
                            - yearly
                            - never

<priority>    optional    This value gives the search engines an idea as to
                          how important you think this page is relative to
                          other pages on your site. The default value if this
                          is not set is 0.5 and valid values can range from
                          0.0-1.0. Note that this serves only as a hint to
                          the search engine as to how you value the page in
                          question, and will not affect the performance of
                          a page in the search engine results.
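The rules in the table can be turned into a quick sanity check. The sketch below is illustrative only: check_entry is a made-up helper, and the date pattern covers just the two <lastmod> forms described above (date only, or date and time with a time zone offset), not every variant of the underlying date standard.

```python
import re

VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}

# Date only, or date plus time (seconds optional) with "Z" or a +hh:mm offset.
LASTMOD_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2}))?$"
)


def check_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty if none)."""
    problems = []
    if not loc:
        problems.append("loc is required")
    if lastmod is not None and not LASTMOD_PATTERN.match(lastmod):
        problems.append(f"lastmod has an unexpected format: {lastmod!r}")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"changefreq is not a valid value: {changefreq!r}")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append(f"priority is outside 0.0-1.0: {priority!r}")
    return problems


if __name__ == "__main__":
    print(check_entry("http://www.example.com/", "2010-12-23", "monthly", 0.8))
    print(check_entry("http://www.example.com/blog/", changefreq="sometimes"))
```

The check only looks at the shape of each value; it does not verify that the date is a real calendar date or that the URL resolves.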

Now that you know what the tags are, below is an example of what a Sitemap will look like when these tags are put into practice.

Example XML Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2010-12-23</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
   <url>
      <loc>http://www.example.com/blog/</loc>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/articles/2009/</loc>
      <lastmod>2009-12-29</lastmod>
      <changefreq>never</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/directory/</loc>
      <lastmod>2010-11-15T18:00:15+00:00</lastmod>
      <priority>0.3</priority>
   </url>
   <url>
      <loc>http://www.example.com/about</loc>
      <lastmod>2009-10-28</lastmod>
   </url>
</urlset>
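If you would rather build a file like this from a list of pages than write it by hand, a minimal sketch using Python's standard xml.etree.ElementTree module might look like the following. The build_sitemap function and the dictionary layout of each entry are assumptions made for this example, not part of any Sitemap tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build Sitemap XML from a list of dicts.

    Each dict needs a "loc" key and may also have "lastmod",
    "changefreq" and "priority" keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child tags in a fixed order, skipping any that are absent.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    pages = [
        {"loc": "http://www.example.com/", "lastmod": "2010-12-23",
         "changefreq": "monthly", "priority": 0.8},
        {"loc": "http://www.example.com/blog/", "changefreq": "weekly"},
    ]
    # Write the file as UTF-8 so it matches the encoding in the XML declaration.
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(build_sitemap(pages))
```

ElementTree escapes characters such as & in URLs for you, which is easy to get wrong when assembling the XML by string concatenation.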

Whilst Googlebot is extremely intelligent and will probably cope without a Sitemap, we recommend you create and submit one anyway. Firstly, it is good practice, and it becomes necessary if the website is continuously changing and growing. Secondly, if you submit it manually using Google Webmaster Tools, you will know whether your Sitemap was processed without any problems, and you can get additional statistics about your site.

You can run a Sitemap generator, which crawls your site and gives you a downloadable file that you can place in the root folder on your server before submitting it to Google. Check one out here.

Have fun, and Merry Sitemapping and a Happy New Year!
