noindex

Noindex is a directive that tells search engines not to include a page in their search results index.

The noindex directive can be set in two ways: as a meta tag in the page's HTML (<meta name="robots" content="noindex">) or as an HTTP response header (X-Robots-Tag: noindex). When a search engine crawls a page and encounters a noindex directive, it leaves the page out of search results. The crawler has to be able to fetch the page to see the directive, so noindex does not work reliably on a page that is also blocked in robots.txt.
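As a rough illustration of the two places a noindex directive can live, the following Python sketch checks a page's HTML and its response headers. The names RobotsMetaParser, split_directives, and has_noindex are hypothetical helpers made up for this example, not part of crawler.sh or any library. It is deliberately simplified: it only looks for the literal noindex value under the generic robots name, so it ignores bot-specific forms such as <meta name="googlebot" ...> or "X-Robots-Tag: googlebot: noindex".

```python
from html.parser import HTMLParser


def split_directives(value):
    """Split a comma-separated directive string into lowercase tokens."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


class RobotsMetaParser(HTMLParser):
    """Collects directives from every <meta name="robots"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            self.directives += split_directives(attr.get("content", ""))


def has_noindex(html, headers):
    """Return True if the HTML or the response headers carry a noindex directive.

    `headers` is assumed to be a plain dict of header name to header value.
    """
    parser = RobotsMetaParser()
    parser.feed(html)

    header_directives = []
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            header_directives += split_directives(value)

    return "noindex" in parser.directives or "noindex" in header_directives
```

For example, has_noindex('<meta name="robots" content="noindex, nofollow">', {}) and has_noindex("<html></html>", {"X-Robots-Tag": "noindex"}) both report a noindexed page, while a page with neither signal does not.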

When to use noindex

Noindex is appropriate for pages that should not appear in search results:

  • Login, registration, and account pages
  • Internal search results pages
  • Thank-you and confirmation pages
  • Admin and staging pages
  • Duplicate content that cannot be consolidated with canonical tags
  • PDFs and other documents that should not rank (these need the X-Robots-Tag header, since they cannot carry a meta tag)

Risks of noindex

The most common problem with noindex is accidental application. A single misplaced noindex tag can remove a high-value page from search results. Common accidental causes include staging environment settings carried into production, CMS plugins that add noindex tags globally, and broad robots meta rules.
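One way to catch accidental application is to compare the pages found to be noindexed against the sections of the site where noindex is intentional. The Python sketch below assumes you already have a list of noindexed URLs from some audit. The function name find_unexpected_noindex and the path prefixes are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse


def find_unexpected_noindex(noindexed_urls, intentional_prefixes):
    """Return noindexed URLs whose path matches none of the intentional prefixes."""
    prefixes = tuple(intentional_prefixes)
    unexpected = []
    for url in noindexed_urls:
        path = urlparse(url).path or "/"
        # str.startswith accepts a tuple and matches if any prefix matches.
        if not path.startswith(prefixes):
            unexpected.append(url)
    return unexpected


# Hypothetical audit result: three noindexed pages, one of which should rank.
noindexed = [
    "https://example.com/login",
    "https://example.com/pricing",
    "https://example.com/admin/users",
]
unexpected = find_unexpected_noindex(noindexed, ["/login", "/admin/"])
print(unexpected)  # ['https://example.com/pricing']
```

Any URL this returns deserves a manual look, since it is noindexed but falls outside the sections where that was planned.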

Unlike removing a page or blocking it in robots.txt, a noindex directive can take effect quickly. Search engines may drop the page from results within days of discovering the tag.

How crawler.sh helps

Run crawler crawl to detect noindex directives in both meta tags and HTTP headers across your entire site. The crawler seo command lists every noindexed page so you can verify each one is intentional. Regular audits help catch accidental noindex tags before they impact traffic.
