Robots.txt and Robots Metatags

Telling a search engine not to index a page sounds relatively easy. According to the specs, all you need to do is put the NOINDEX directive in the page's Robots Metatag. Sounds like you’re all set. But you might not be.

Why? It turns out that if you have also used your Robots.txt file to tell Google not to crawl the pages you have “NOINDEX”ed, Google will never see the metatags. Robots.txt is checked before a page is fetched, and a page that is never fetched is never read, so the NOINDEX inside it goes unnoticed. Translation: the Robots Metatag is effectively ignored if you exclude the crawling of the page via Robots.txt.
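To see whether a given page falls under a Robots.txt exclusion, a rough check along these lines may help. It is only a sketch using Python's standard `urllib.robotparser`, whose matching is simpler than Google's own, so treat the result as an approximation. The rules, the example.com URLs, and the `is_crawl_blocked` helper name are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if these robots.txt rules disallow crawling the URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical URLs: one under the disallowed path, one outside it.
blocked_page = is_crawl_blocked(ROBOTS_TXT, "https://example.com/private/page.html")
open_page = is_crawl_blocked(ROBOTS_TXT, "https://example.com/public/page.html")
```

If a page you have NOINDEXed comes back as blocked here, its metatag is probably not being read.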

And even though the page is not crawled, it can still be indexed if other pages link to it. So if you want to keep a page out of the index, use only the Robots Metatag and leave the page crawlable in Robots.txt, and you should be off to the races.
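For reference, the metatag in question goes in the page's head and looks like `<meta name="robots" content="noindex">`. A small sketch for confirming that a page actually carries it, using Python's standard `html.parser`, might look like this. The `RobotsMetaParser` class, the `has_noindex` helper, and the sample HTML are assumptions for illustration, not part of any official tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their case.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a robots metatag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, for illustration only.
PAGE = '<html><head><meta name="robots" content="NOINDEX, follow"></head><body></body></html>'
page_is_noindexed = has_noindex(PAGE)
```

The check only helps if the crawler can reach the page, so it is worth running alongside a look at your Robots.txt rules.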
