Matt Cutts has just put out a posting on how search engines handle the “noindex” metatag. He makes some interesting observations about the results as follows:
- Google doesn’t show the page in any way
- Ask doesn’t show the page in any way
- MSN shows a url reference and Cached link, but no snippet. Clicking the cached link doesn’t return anything.
- Yahoo! shows a url reference and Cached link, but no snippet. Clicking on the cached link returns the cached page.
Matt goes on to note that it would be great if all the search engines treated the noindex metatag, well, like a noindex request (translation, “DON’T index it” – my words, not Matt’s). I think that this is an excellent idea.
Danny Sullivan, still of Search Engine Watch, adds his thoughts to the noindex discussion. Danny observes that Google treats the robots.txt file in a way that many webmasters may not expect. In particular, Google may still index pages that are marked as “don’t crawl” in the robots.txt file.
The one case I have seen where this happens is when pages are marked in robots.txt as “disallow”, yet third party pages link to those pages. In the past, Google has reserved the right to still index these pages.
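To make the “disallow” case concrete, here is a minimal sketch in Python of how a well-behaved crawler might decide whether it may fetch a URL. The robots.txt content, the `ExampleBot` user agent, the example.com URLs and the `may_crawl` helper are all made up for illustration. Note that this check only covers crawling; it says nothing about whether a URL discovered through third party links ends up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /private/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A URL under the disallowed path, and one outside it.
print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/private/report.html"))
print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html"))
```

The first call reports that the crawler may not fetch the page, the second that it may. A link from another site can still tell a search engine that the blocked URL exists, which is how a “disallow” page can show up as a bare URL reference.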
In fact, Google may still index a page even if it is marked with the noindex metatag. This can happen if it is also listed in the robots.txt file as “disallow”. The reason is that Google treats the robots.txt as being a higher priority than the noindex metatag, and ignores the metatag if the page is also called out in the robots.txt file.
Now it turns out that this is pretty simple to address. All you need to do is leave the page out of the robots.txt file, and rely solely on the noindex metatag. In this event, Google will reliably not index the page, in all cases (that I know of).
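One way to picture the conflict, and this is my reading rather than anything Google has documented here, is that a crawler which obeys the “disallow” rule never downloads the page, so it never gets to read the metatag. The sketch below checks a page for a noindex metatag and flags the case where robots.txt would hide that tag from the crawler. The sample HTML, the `ExampleBot` user agent, the URL and the helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a robots noindex metatag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def noindex_can_be_seen(robots_txt: str, user_agent: str, url: str, html: str) -> bool:
    """Return True only if the page has noindex AND the crawler may fetch it."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    return has_noindex(html) and robots.can_fetch(user_agent, url)


# Hypothetical page and rules, for illustration only.
PAGE = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
URL = "https://example.com/private/report.html"
BLOCKING_RULES = "User-agent: *\nDisallow: /private/\n"
NO_RULES = ""

print(noindex_can_be_seen(BLOCKING_RULES, "ExampleBot", URL, PAGE))  # tag is hidden
print(noindex_can_be_seen(NO_RULES, "ExampleBot", URL, PAGE))        # tag is readable
```

With the blocking rule in place the check fails, and with the page left out of robots.txt it passes, which matches the advice above to rely on the metatag alone.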
Danny repeats a call I have seen him make in the past for improved standardization of search engine behavior. More standardization by the search engines sounds like another excellent idea. We all recognize that there are things that the search engines need to keep close to the vest, and treat as proprietary. But making life easier on SEOs and search engines by following agreed-upon standards will benefit us all.
You can read more about metatags and SEO here.