What is the Meta Robots Tag in SEO

The meta robots tag is an HTML <meta> element placed in the <head> section of a web page. It provides instructions to search engine crawlers about how to treat that specific page. Unlike robots.txt, which controls crawling at the site or directory level, the meta robots tag controls indexing behavior for individual pages.

Every page on your site can have its own meta robots tag. This granular control lets you allow crawling of a page while preventing it from appearing in search results, or allow indexing while preventing crawlers from following links on that page.

Common meta robots directives

Directive	Meaning	Use case
`noindex`	Do not include this page in the search index	Thank you pages, admin panels, duplicate content
`nofollow`	Do not follow links on this page	User-generated content, untrusted links, login-required pages
`noarchive`	Do not show a cached copy of this page	Content that updates frequently, where a cached version would be misleading
`nosnippet`	Do not show a text snippet or video preview	Pages where previews might reveal sensitive information
`noimageindex`	Do not index images on this page	Photo galleries where you want text indexed but not images
`notranslate`	Do not offer translation of this page	Pages with technical terminology that should not be machine-translated
`unavailable_after`	Remove this page from the index after a specified date	Time-limited offers, event pages, seasonal content
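
Most directives are bare keywords, but unavailable_after carries a date value. The following is a minimal Python sketch of building such a tag. It assumes an ISO 8601 timestamp, which is one of the widely adopted date formats Google documents as acceptable; the helper name unavailable_after_tag is hypothetical.

```python
from datetime import datetime, timezone


def unavailable_after_tag(expires: datetime) -> str:
    """Build a meta robots tag asking crawlers to drop the page after `expires`.

    Hypothetical helper for illustration; the timestamp is emitted as ISO 8601.
    """
    if expires.tzinfo is None:
        # Treat naive datetimes as UTC so the output always carries an offset.
        expires = expires.replace(tzinfo=timezone.utc)
    return f'<meta name="robots" content="unavailable_after: {expires.isoformat()}">'


print(unavailable_after_tag(datetime(2030, 12, 31, 23, 59, 59, tzinfo=timezone.utc)))
# <meta name="robots" content="unavailable_after: 2030-12-31T23:59:59+00:00">
```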

Multiple directives can be combined in a single tag:

<meta name="robots" content="noindex, nofollow">

This tells search engines not to index the page and not to follow any links found on it.
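
To see how a crawler might read a combined tag, here is a rough Python sketch that extracts the directives from a page using only the standard library. The names MetaRobotsParser and meta_robots_directives are made up for this example, and the comma split is a simplification: it would mangle an unavailable_after date that itself contains a comma.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # Directives are comma-separated and case-insensitive.
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def meta_robots_directives(html: str) -> set[str]:
    """Return the set of directives found in all generic meta robots tags."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(sorted(meta_robots_directives(page)))  # ['nofollow', 'noindex']
```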

Targeting specific crawlers

The name attribute can target specific user agents rather than all crawlers:

<meta name="googlebot" content="noindex">
<meta name="bingbot" content="noindex">

This keeps the page out of Google and Bing only; other crawlers remain free to index it. Use this sparingly, as most site owners want consistent behavior across all search engines.
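
When a page carries both a generic robots tag and a crawler-specific one, the crawler has to decide which rules apply. The Python sketch below assumes the crawler combines the two sets of restrictions, which is how Google documents Googlebot's behavior; other crawlers may resolve this differently. The function name and the dictionary input (meta name mapped to content value) are illustrative assumptions.

```python
def directives_for(crawler: str, tags: dict[str, str]) -> set[str]:
    """Union the generic "robots" rules with those addressed to `crawler`."""
    found = set()
    for name, content in tags.items():
        if name.lower() in ("robots", crawler.lower()):
            found.update(p.strip().lower() for p in content.split(",") if p.strip())
    return found


# Hypothetical page with one generic tag and one Googlebot-specific tag.
tags = {"robots": "nofollow", "googlebot": "noindex"}
print(sorted(directives_for("googlebot", tags)))  # ['nofollow', 'noindex']
print(sorted(directives_for("bingbot", tags)))    # ['nofollow']
```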

Meta robots vs X-Robots-Tag

The same directives can be delivered via an HTTP header called X-Robots-Tag. This is useful for non-HTML files like PDFs, images, or video where a <meta> tag cannot be embedded. Both methods are equally valid, but X-Robots-Tag is less commonly used for HTML pages.

Example HTTP response header:

X-Robots-Tag: noindex, nofollow

The header travels with the file itself, so a PDF response can carry it alongside its Content-Type:

X-Robots-Tag: noindex
Content-Type: application/pdf
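
Because non-HTML files cannot carry a meta tag, the header is usually attached by the server based on the file type. As a loose illustration in Python, the sketch below picks the extra response headers from the URL path. The extension list and the function name robots_headers are assumptions for this example, not a recommended policy.

```python
from pathlib import PurePosixPath

# Hypothetical policy: file types whose responses should stay out of the index.
NOINDEX_EXTENSIONS = {".pdf", ".doc", ".docx"}


def robots_headers(url_path: str) -> dict[str, str]:
    """Return the extra response headers to send for the given URL path."""
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in NOINDEX_EXTENSIONS:
        return {"X-Robots-Tag": "noindex"}
    return {}


print(robots_headers("/downloads/report.PDF"))  # {'X-Robots-Tag': 'noindex'}
print(robots_headers("/blog/post"))             # {}
```

In practice the same rule is often expressed in web server configuration rather than application code.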

Common mistakes

Applying noindex to staging or development sites and forgetting to remove it before launch. This is one of the most common causes of a new site not appearing in search results.
Using noindex on important pages like product categories or blog posts. Always double-check which template applies the tag.
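Conflicting directives between meta tags and HTTP headers. If the meta says index but the header says noindex, Google applies the more restrictive directive, but other crawlers may behave differently.
Assuming noindex also prevents crawling. It does not: the page must be crawled for the tag to be seen. Use robots.txt Disallow if you want to block crawling entirely, but not on the same URL as a noindex, because a blocked crawler never reads the tag and the URL can still be indexed from external links.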
Using nofollow site-wide, which blocks link equity flow throughout the site and prevents crawlers from discovering new pages.
Applying noindex to paginated series, which can prevent search engines from understanding the relationship between pages.
Forgetting that a long-standing noindex can lead to nofollow behavior. If a page stays out of the index, crawlers may eventually stop following its links, so those links may not pass equity even without an explicit nofollow.
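
Two of these mistakes are mechanical enough to check in code. The Python sketch below flags a page whose meta tag and header disagree about indexing, and a page whose noindex cannot be read because robots.txt blocks the URL. Both function names are hypothetical, and the robots.txt check uses the wildcard user agent as a simplifying assumption.

```python
from urllib.robotparser import RobotFileParser


def has_index_conflict(meta: set[str], header: set[str]) -> bool:
    """True when noindex appears alongside an explicit index/all in either source."""
    combined = meta | header
    return "noindex" in combined and bool(combined & {"index", "all"})


def noindex_hidden_by_robots_txt(robots_txt: str, url: str, directives: set[str]) -> bool:
    """True when a page carries noindex but robots.txt stops crawlers fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A crawler that may not fetch the URL never gets to read the noindex.
    return "noindex" in directives and not parser.can_fetch("*", url)


print(has_index_conflict({"index", "follow"}, {"noindex"}))  # True

rules = "User-agent: *\nDisallow: /private/"
print(noindex_hidden_by_robots_txt(rules, "https://example.com/private/report", {"noindex"}))  # True
```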

Meta robots and SEO strategy

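Strategic use of meta robots tags keeps low-value pages out of search results and, because pages that stay noindexed may be crawled less often over time, helps focus crawl budget on valuable pages. Typical candidates: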

Thin pages - Login forms, search results, tag archives with few posts
Duplicate content - Print-friendly versions, mobile-specific URLs, parameter-based sorting
Private content - Account dashboards, checkout flows, internal documentation
Temporary pages - Campaign landing pages, expired promotions, maintenance notices
Paginated content - Give each page in the series its own self-referencing rel="canonical" rather than applying noindex to subsequent pages or pointing every canonical at the first page

How crawler.sh checks meta robots tags

crawler.sh extracts and reports meta robots directives for every crawled page. The crawler seo command provides:

A count of pages with noindex directives
A list of noindex pages so you can audit them individually
Detection of nofollow at the page level
The raw meta robots content for each page in the crawl output
Identification of conflicting directives between meta tags and HTTP headers

This helps catch accidental blocking of content that should be indexed. The CSV export lets you sort and filter by directive type, making it easy to spot patterns like an entire directory that was accidentally noindexed.
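
As an illustration of that kind of pattern-spotting, the Python sketch below counts noindexed pages per top-level directory in a CSV export. The column names url and meta_robots are assumptions made for this example and may not match the actual export format, so adjust them to the real headers.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def noindex_by_directory(csv_text: str) -> Counter:
    """Count noindexed pages per top-level directory.

    Assumes hypothetical "url" and "meta_robots" columns in the export.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        directives = {d.strip().lower() for d in row["meta_robots"].split(",")}
        if "noindex" not in directives:
            continue
        parts = urlparse(row["url"]).path.strip("/").split("/")
        # Pages directly under the root are grouped as "/".
        top = f"/{parts[0]}/" if len(parts) > 1 else "/"
        counts[top] += 1
    return counts


# Made-up sample data, not real crawl output.
sample = """url,meta_robots
https://example.com/blog/a,"noindex, follow"
https://example.com/blog/b,noindex
https://example.com/about,index
"""
print(noindex_by_directory(sample))  # Counter({'/blog/': 2})
```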