Thursday, November 24, 2011

My First Website - Using Sitemaps

When you create a new website, it will be indexed as soon as crawlers encounter a link to it. In general, you don't have to submit any information to search engines, because the web is built on links. If a page exists, there are usually some links pointing to it; otherwise, the page is probably not worth indexing. However, there are situations where crawlers cannot discover all the pages you have. This is where you can help them with sitemaps.


A sitemap is an XML file with a defined structure. It contains information about all the pages of the website. All major search engines provide a way to submit sitemap files; Google, for example, accepts them through Webmaster Tools.
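
To give an idea of that structure, here is a minimal sitemap following the sitemaps.org protocol. The URLs are placeholders; only the <loc> element is required for each entry, while the rest are optional hints:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2011-11-24</lastmod>        <!-- date of last modification -->
        <changefreq>weekly</changefreq>      <!-- hint: how often the page changes -->
        <priority>0.8</priority>             <!-- relative importance, 0.0 to 1.0 -->
      </url>
      <url>
        <loc>http://www.example.com/about.html</loc>
      </url>
    </urlset>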


The main purpose of the sitemap is to inform search engines and crawlers about all the pages you have. Using this information, bots, spiders, and crawlers can visit pages even if there is no link pointing to them. But there is also another, more important function: by submitting a sitemap file you can track how many of your pages the search engine has indexed. This is valuable information, because if you find out that not all pages are indexed, you may have a problem. In many cases, pages with little or no content will not be indexed, which is normal. Less "normal" is when an article page is not indexed. Sometimes you simply have to wait a while; in other cases the reason may be duplicate content or anything else that makes the search engine think the page is not worth indexing.


You can also add images, videos, code, and geo information to the sitemap file. This way you can inform Google, Yahoo, and Bing about everything you think is worth indexing. More information about these extensions can be found on the Google Webmaster Central Blog.
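
As an example, here is a sketch of an image entry using Google's image sitemap extension (the page and image URLs are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <loc>http://www.example.com/gallery.html</loc>
        <!-- Each <image:image> describes one image used on this page -->
        <image:image>
          <image:loc>http://www.example.com/photos/sunset.jpg</image:loc>
          <image:caption>Sunset over the lake</image:caption>
        </image:image>
      </url>
    </urlset>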


Creating a sitemap file is pretty easy. All popular content management systems, including WordPress and Joomla, have extensions that can automatically create a sitemap file from all the pages of the website, so you don't have to worry about the file structure and links. There is also a solution for websites without such a feature: many scripts can crawl the site by following internal links and use that information to create a file which can be submitted to search engines, as the sketch below illustrates.
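
As a rough illustration of how such a script works, here is a minimal Python sketch. It follows internal links starting from the homepage and writes every discovered URL to sitemap.xml. The start URL is a placeholder, and a real crawler would also respect robots.txt, throttle its requests, and filter out non-HTML resources:

    import re
    import urllib.parse
    import urllib.request

    START_URL = "http://www.example.com/"  # placeholder: your site's homepage
    HOST = urllib.parse.urlparse(START_URL).netloc

    to_visit = [START_URL]
    seen = set()

    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
        except Exception:
            continue  # skip pages that fail to load
        # Find href targets and resolve them relative to the current page
        for href in re.findall(r'href=["\'](.*?)["\']', html):
            link = urllib.parse.urljoin(url, href.split("#")[0])
            # Follow only internal links so the crawl stays on this site
            if urllib.parse.urlparse(link).netloc == HOST and link not in seen:
                to_visit.append(link)

    # Write every discovered page into the sitemap file
    with open("sitemap.xml", "w") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for page in sorted(seen):
            f.write("  <url><loc>%s</loc></url>\n" % page)
        f.write('</urlset>\n')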

