One of the questions we are most often asked is how the NOINDEX robots meta tag differs from the robots.txt file, and when each should be used. This article answers that question.
The NOINDEX robots meta tag
The NOINDEX tag is used to prevent content from appearing in search results. It appears in the source code of your content and gives instructions to search engine crawlers as they crawl your site. When a search engine sees the NOINDEX meta tag, it excludes that content from search results.
The NOINDEX robots meta tag looks like this in your page source code:
<meta name="robots" content="noindex,follow" />
The robots.txt file
The robots.txt file tells search engine crawlers where they can and cannot go on a website. It includes “Allow” and “Disallow” directives that guide a search engine as to which directories and files it should or should not crawl. However, it does not stop your content from being listed in search results.
An example of how you’d use the robots.txt file is to instruct search engines not to crawl the “/cgi-bin/” directory that may exist on your server, because there’s nothing in the directory that is of use to search engines.
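As a rough sketch, the effect of such a rule can be checked with Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs, and the can_crawl helper below are assumptions made up for illustration. They are not part of the plugin or of WordPress.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks crawling of /cgi-bin/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent crawl url."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The blocked directory is off limits; everything else may be crawled.
print(can_crawl(ROBOTS_TXT, "https://example.com/cgi-bin/script.pl"))
print(can_crawl(ROBOTS_TXT, "https://example.com/blog/hello-world/"))
```

The first call should report that the URL may not be crawled and the second that it may. Neither result says anything about whether the URL can appear in search results.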
The default robots.txt for WordPress looks like this:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
The difference between NOINDEX and robots.txt
The difference between the two is as follows:
- The robots.txt file is used to guide a search engine as to which directories and files it should crawl. It does not stop content from being indexed and listed in search results.
- The NOINDEX robots meta tag tells search engines not to include content in search results and, if the content has already been indexed, to drop it from the index entirely. It does not stop search engines from crawling content.
The biggest difference to understand is that if you want search engines to not include content in search results, then you MUST use the NOINDEX tag and you MUST allow search engines to crawl the content. If search engines CANNOT crawl the content then they CANNOT see the NOINDEX meta tag and therefore CANNOT exclude the content from search results.
So if you want content not to be included in search results, then use NOINDEX. If you want to stop search engines from crawling a directory on your server because it contains nothing they need to see, then use the “Disallow” directive in your robots.txt file.
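The interaction between the two can be illustrated with a minimal Python sketch that uses only the standard library. It assumes a simplified model in which NOINDEX only takes effect when the page carries the tag and robots.txt lets the crawler fetch the page. The RobotsMetaParser class, the noindex_is_effective function, the sample robots.txt rules, and the example.com URL are all hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def noindex_is_effective(robots_txt, url, html, user_agent="*"):
    """True only if the page has noindex AND robots.txt allows crawling it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    meta = RobotsMetaParser()
    meta.feed(html)
    return "noindex" in meta.directives and rules.can_fetch(user_agent, url)


# Hypothetical page and robots.txt variants for illustration.
PAGE = '<html><head><meta name="robots" content="noindex,follow" /></head></html>'
URL = "https://example.com/private/page/"
BLOCKED = "User-agent: *\nDisallow: /private/\n"
OPEN = "User-agent: *\nDisallow: /cgi-bin/\n"

print(noindex_is_effective(BLOCKED, URL, PAGE))  # crawler cannot see the tag
print(noindex_is_effective(OPEN, URL, PAGE))     # crawler can see the tag
```

With the BLOCKED rules the function should return False, because a crawler that obeys the Disallow rule never reads the page and so never sees the NOINDEX tag. With the OPEN rules it should return True.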