Meta robots tag

What is the Meta Robots Tag in SEO

The meta robots tag is an HTML element that tells search engines whether to index a specific page and whether to follow the links on it.

The meta robots tag is an HTML <meta> element placed in the <head> section of a web page. It provides instructions to search engine crawlers about how to treat that specific page. Unlike robots.txt, which controls crawling at the site or directory level, the meta robots tag controls indexing behavior for individual pages.

Every page on your site can have its own meta robots tag. This granular control lets you allow crawling of a page while preventing it from appearing in search results, or allow indexing while preventing crawlers from following links on that page.

Common meta robots directives

Directive | Meaning | Use case
noindex | Do not include this page in the search index | Thank you pages, admin panels, duplicate content
nofollow | Do not follow links on this page | User-generated content, untrusted links, login-required pages
noarchive | Do not show a cached copy of this page | Content that updates frequently, where cached versions would be misleading
nosnippet | Do not show a text snippet or video preview | Pages where previews might reveal sensitive information
noimageindex | Do not index images on this page | Photo galleries where you want text indexed but not images
notranslate | Do not offer translation of this page | Pages with technical terminology that should not be machine-translated
unavailable_after | Stop showing this page in search results after a specified date | Time-limited offers, event pages, seasonal content
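
Unlike the other directives, unavailable_after takes a value. The following is a minimal Python sketch of how such a tag could be generated from a template. The function name unavailable_after_tag is hypothetical, and the ISO 8601 timestamp format is an assumption about what your target crawlers accept, so check each search engine's documentation before relying on it.

```python
from datetime import datetime, timezone
from html import escape


def unavailable_after_tag(expires):
    """Build a meta robots tag that expires a page at `expires`.

    `expires` must be timezone-aware so the crawler does not have to
    guess which timezone the date refers to.
    """
    if expires.tzinfo is None:
        raise ValueError("expires must be timezone-aware")
    # ISO 8601 with seconds precision, e.g. 2025-12-31T23:59:59+00:00
    value = f"unavailable_after: {expires.isoformat(timespec='seconds')}"
    return f'<meta name="robots" content="{escape(value, quote=True)}">'


if __name__ == "__main__":
    print(unavailable_after_tag(datetime(2025, 12, 31, 23, 59, 59, tzinfo=timezone.utc)))
```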

Multiple directives can be combined in a single tag:

<meta name="robots" content="noindex, nofollow">

This tells search engines not to index the page and not to follow any links found on it.
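
To make the comma-separated format concrete, here is a rough Python sketch of how a crawler might read the tag. It uses only the standard library. The class and function names are illustrative, not part of any real crawler, and the simple comma split is an assumption that would mis-handle an unavailable_after date that itself contains a comma.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self, crawler_names=("robots",)):
        super().__init__()
        self.crawler_names = {name.lower() for name in crawler_names}
        self.directives = {}  # tag name -> set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        if name not in self.crawler_names:
            return
        # Directives are comma-separated and case-insensitive.
        parts = {p.strip().lower() for p in attr.get("content", "").split(",")}
        parts.discard("")
        self.directives.setdefault(name, set()).update(parts)


def parse_meta_robots(html, crawler_names=("robots",)):
    """Return a dict mapping each matching meta name to its directive set."""
    parser = MetaRobotsParser(crawler_names)
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(parse_meta_robots(page))
```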

Targeting specific crawlers

The name attribute can target specific user agents rather than all crawlers:

<meta name="googlebot" content="noindex">
<meta name="bingbot" content="noindex">

This keeps the page out of Google and Bing only; other crawlers may still index it. Use this sparingly, as most site owners want consistent behavior across all search engines.
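
The interaction between generic and crawler-specific tags can be sketched in Python as follows. The sketch assumes a crawler obeys the union of the rules addressed to "robots" and to its own name, which is how Google describes its handling; other crawlers may resolve this differently. The function names and the input shape (a dict of tag name to directive set) are illustrative assumptions.

```python
def effective_directives(tags, crawler):
    """Union of the generic 'robots' rules and the rules addressed to `crawler`.

    `tags` maps a meta name (e.g. "robots", "googlebot") to a set of directives.
    """
    generic = tags.get("robots", set())
    specific = tags.get(crawler.lower(), set())
    return set(generic) | set(specific)


def is_indexable(tags, crawler):
    """True unless a rule that applies to `crawler` forbids indexing."""
    directives = effective_directives(tags, crawler)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" not in directives and "none" not in directives


if __name__ == "__main__":
    tags = {"googlebot": {"noindex"}, "bingbot": {"noindex"}}
    for bot in ("googlebot", "bingbot", "duckduckbot"):
        print(bot, is_indexable(tags, bot))
```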

Meta robots vs X-Robots-Tag

The same directives can be delivered via an HTTP header called X-Robots-Tag. This is useful for non-HTML files like PDFs, images, or video where a <meta> tag cannot be embedded. Both methods are equally valid, but X-Robots-Tag is less commonly used for HTML pages.

Example HTTP response header:

X-Robots-Tag: noindex, nofollow

The header is usually added by server configuration for specific file types, so a PDF response might carry both of these headers:

X-Robots-Tag: noindex
Content-Type: application/pdf
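
As a rough illustration of choosing the header by file type, the Python sketch below builds response headers from a request path. It is not tied to any particular web server. The function name response_headers and the NOINDEX_TYPES set are assumptions made for the example, and in practice this rule usually lives in the server or CDN configuration.

```python
import mimetypes

# Assumption for this example: only PDFs should be kept out of the index.
NOINDEX_TYPES = {"application/pdf"}


def response_headers(path):
    """Return response headers for `path`, adding X-Robots-Tag where needed."""
    content_type, _encoding = mimetypes.guess_type(path)
    content_type = content_type or "application/octet-stream"
    headers = {"Content-Type": content_type}
    if content_type in NOINDEX_TYPES:
        headers["X-Robots-Tag"] = "noindex"
    return headers


if __name__ == "__main__":
    print(response_headers("/files/report.pdf"))
    print(response_headers("/index.html"))
```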

Common mistakes

  • Applying noindex to staging or development sites and forgetting to remove it before launch. This is one of the most common causes of a new site not appearing in search results.
  • Using noindex on important pages like product categories or blog posts. Always double-check which template applies the tag.
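  • Conflicting directives between meta tags and HTTP headers. If the meta tag says index but the header says noindex, Google applies the more restrictive rule, but other crawlers may behave differently.
  • Assuming noindex also prevents crawling. It does not. Use robots.txt Disallow if you want to block crawling entirely, but do not combine the two on the same URL: a crawler that is blocked from fetching a page never sees its noindex tag, so the URL can still end up in the index.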
  • Using nofollow site-wide, which blocks link equity flow throughout the site and prevents crawlers from discovering new pages.
  • Applying noindex to paginated series, which can prevent search engines from understanding the relationship between pages.
  • Forgetting that noindex eventually leads to nofollow behavior. If a page is not indexed, links from it may not pass equity even without an explicit nofollow.
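
Two of the mistakes above, conflicting sources and noindex on a URL that robots.txt blocks, are easy to check mechanically. The following Python sketch shows one way to do it using the standard library's robots.txt parser. The function name audit_page, its inputs (directive sets already extracted from the page and header), and the warning strings are all illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def audit_page(url, meta, header, robots_txt, agent="*"):
    """Return a list of warnings for one page.

    `meta` and `header` are the directive sets from the meta robots tag
    and the X-Robots-Tag header; `robots_txt` is the robots.txt body.
    """
    warnings = []
    meta, header = set(meta), set(header)

    # One source explicitly allows what the other forbids.
    for allow, deny in (("index", "noindex"), ("follow", "nofollow")):
        if (allow in meta and deny in header) or (deny in meta and allow in header):
            warnings.append(f"conflict: {allow} vs {deny} between meta tag and X-Robots-Tag")

    # A noindex directive is invisible if the crawler may not fetch the URL.
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if "noindex" in (meta | header) and not rules.can_fetch(agent, url):
        warnings.append("noindex on a URL disallowed in robots.txt: crawlers may never see it")
    return warnings


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /admin/"
    print(audit_page("https://example.com/admin/login", {"noindex"}, set(), robots))
```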

Meta robots and SEO strategy

Strategic use of meta robots tags keeps low-value pages out of the index and focuses search engines on valuable pages. Page types to review:

  • Thin pages - Login forms, search results, tag archives with few posts
  • Duplicate content - Print-friendly versions, mobile-specific URLs, parameter-based sorting
  • Private content - Account dashboards, checkout flows, internal documentation
  • Temporary pages - Campaign landing pages, expired promotions, maintenance notices
  • Paginated content - Give each page in the series its own canonical URL rather than pointing rel="canonical" at the first page or applying noindex to subsequent pages

How crawler.sh checks meta robots tags

crawler.sh extracts and reports meta robots directives for every crawled page. The crawler seo command provides:

  • A count of pages with noindex directives
  • A list of noindex pages so you can audit them individually
  • Detection of nofollow at the page level
  • The raw meta robots content for each page in the crawl output
  • Identification of conflicting directives between meta tags and HTTP headers

This helps catch accidental blocking of content that should be indexed. The CSV export lets you sort and filter by directive type, making it easy to spot patterns like an entire directory that was accidentally noindexed.
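
Spotting a noindexed directory in an export could look like the Python sketch below. The column names url and meta_robots are assumptions made for the example, not the documented crawler.sh export schema, so adjust them to match the actual CSV header. The function name noindex_by_directory is also hypothetical.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def noindex_by_directory(csv_text, url_col="url", robots_col="meta_robots"):
    """Count noindex pages per top-level directory.

    The default column names are hypothetical; pass the real ones
    from your export's header row.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        directives = {d.strip().lower() for d in (row.get(robots_col) or "").split(",")}
        if "noindex" not in directives:
            continue
        segments = [s for s in urlparse(row[url_col]).path.split("/") if s]
        # Pages directly under the root are grouped as "/".
        directory = f"/{segments[0]}/" if len(segments) > 1 else "/"
        counts[directory] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "url,meta_robots\n"
        'https://example.com/blog/a,"noindex, follow"\n'
        "https://example.com/blog/b,noindex\n"
        "https://example.com/about,index\n"
    )
    for directory, count in noindex_by_directory(sample).most_common():
        print(directory, count)
```

A directory whose count is close to its total page count is a likely sign that a template or server rule applied noindex more broadly than intended.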
