How to Find Noindex Pages with CLI
Learn how to detect pages blocked from indexing with noindex directives using crawler.sh CLI. Ensure important pages are not accidentally hidden.
A noindex directive tells search engines not to include a page in their index. This is useful for pages like login screens or internal search results, but accidentally noindexing important pages makes them completely invisible in search results. A single misplaced noindex tag can remove a high-traffic page from Google overnight.
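For reference, the directive can appear in two places: as a robots meta tag in the page's HTML head, or as an HTTP response header.

```html
<head>
  <meta name="robots" content="noindex">
</head>
```

```http
X-Robots-Tag: noindex
```

Either one alone is enough to keep the page out of the index, so a page can look clean in its HTML while still being blocked at the header level.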
This guide shows you how to find every noindexed page on your website using the crawler.sh CLI.
Step 1: Install crawler.sh CLI
Install the CLI with a single command:
```
curl -fsSL https://install.crawler.sh | sh
```

This downloads the correct binary for your operating system and architecture, places it in `~/.crawler/bin/`, and adds it to your PATH. Restart your terminal or run `source ~/.bashrc` (or `~/.zshrc`) to pick up the new PATH entry.
Verify the installation:
```
crawler --version
```

Step 2: Crawl the target website
Run a full crawl of the website you want to audit:
```
crawler crawl https://example.com
```

The crawler checks both the `<meta name="robots">` tag and the `X-Robots-Tag` HTTP header for noindex directives. Results are saved as an NDJSON file (`.crawl`) in the current directory.
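Because the crawl file is NDJSON (one JSON object per line), you can also pre-filter it yourself. The field names below (`url`, `noindex`) are assumptions for illustration only; check the actual keys in your `.crawl` file before relying on them:

```shell
# Hypothetical .crawl lines -- the real schema is crawler.sh's own,
# and these field names are assumed for the sake of the example.
cat > example-com.crawl <<'EOF'
{"url":"https://example.com/","noindex":false}
{"url":"https://example.com/login","noindex":true}
{"url":"https://example.com/pricing","noindex":true}
EOF

# Crude string match; jq is more robust if you have it installed:
#   jq -r 'select(.noindex) | .url' example-com.crawl
grep '"noindex":true' example-com.crawl | sed -E 's/.*"url":"([^"]*)".*/\1/'
```

This prints only the URLs flagged as noindexed, which is handy for piping into other scripts or a ticket.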
Step 3: Run SEO audit
Run the SEO analysis on your crawl data:
```
crawler seo example-com.crawl
```

The noindex pages check flags every page that contains a noindex directive, whether in the HTML meta tag or the HTTP header.
Step 4: Identify noindex pages
Look for the Noindex Pages section in the SEO report. Review each flagged page to determine if the noindex is intentional or accidental. Common causes of accidental noindexing:
- Staging environment settings left in place after going live
- CMS “discourage search engines” checkbox forgotten after development
- Plugin or theme updates that reset indexing settings
- Blanket noindex rules in robots meta that are too broad
- A/B testing tools that add noindex to test variants
Step 5: Fix and re-crawl
For each flagged page:
- Remove noindex from pages that should appear in search results
- Keep noindex on pages that should not be indexed (login, admin, thank-you pages, duplicate content)
- Check both sources: the meta tag in the HTML and the X-Robots-Tag HTTP header
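To spot-check a single page by hand, you can inspect both sources directly. In real use you would fetch them with curl (`curl -s URL > page.html` for the body, `curl -sI URL > headers.txt` for the headers); the fixture files below stand in for that output:

```shell
# Stand-ins for curl output (in practice: curl -s URL / curl -sI URL)
cat > page.html <<'EOF'
<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>
EOF
cat > headers.txt <<'EOF'
HTTP/1.1 200 OK
X-Robots-Tag: noindex
EOF

# Source 1: the robots meta tag in the HTML
grep -io '<meta[^>]*name="robots"[^>]*>' page.html
# Source 2: the X-Robots-Tag response header
grep -i '^x-robots-tag' headers.txt
```

If either grep prints a line containing noindex, the page is blocked; remember that fixing only one source is not enough when both carry the directive.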
After fixing, re-crawl to verify:
```
crawler crawl https://example.com
crawler seo example-com.crawl
```

Why noindex pages matter for SEO
A noindex directive is absolute: if it is present, the page will not appear in search results regardless of its content quality or backlinks. That makes accidental noindexing one of the most damaging SEO mistakes, and regular audits catch these issues before they cost you traffic. Even intentionally noindexed pages are worth reviewing periodically, as business needs change and pages that were once internal may now be valuable to index.