Crawl budget
Crawl budget is the number of pages a search engine will crawl on your site within a given timeframe. It is determined by two factors: crawl rate limit (how fast the search engine can crawl without overloading your server) and crawl demand (how much the search engine wants to crawl based on page importance and freshness).
Why crawl budget matters
For small sites with a few hundred pages, crawl budget is rarely a concern because search engines can easily crawl every page. For larger sites with thousands or tens of thousands of pages, crawl budget becomes critical. If search engines spend their budget on low-value pages (duplicate content, parameter URLs, or error pages), important pages may be crawled late or not at all, and pages that are never crawled cannot be indexed.
Wasted crawl budget means slower indexing of new content, delayed recognition of updates, and potential ranking issues for pages that search engines visit infrequently.
Factors that waste crawl budget
- Redirect chains that force multiple requests per page
- Duplicate content accessible at multiple URLs
- Soft 404 pages that return 200 status codes
- Infinite URL spaces from faceted navigation or calendars
- Orphan pages with no internal links pointing to them, which crawlers can still reach through sitemaps or external links
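
Two of these patterns, redirect chains and duplicate URLs, can be spotted in crawl data with a few lines of code. The following is a minimal sketch in Python and is not part of crawler.sh. The PAGES mapping, the example.com URLs, and the IGNORED_PARAMS set are hypothetical, chosen only for illustration; a real audit would load this data from an actual crawl.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical crawl results: URL -> (status code, redirect target or None).
PAGES = {
    "https://example.com/old": (301, "https://example.com/older"),
    "https://example.com/older": (301, "https://example.com/new"),
    "https://example.com/new": (200, None),
    "https://example.com/shoes?utm_source=mail": (200, None),
    "https://example.com/shoes": (200, None),
}

# Assumed set of query parameters that do not change page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

REDIRECT_CODES = (301, 302, 307, 308)


def redirect_chain(url, pages, max_hops=10):
    """Follow redirects starting at url and return every URL visited."""
    chain = [url]
    while len(chain) <= max_hops:
        status, target = pages.get(chain[-1], (None, None))
        # Stop on a non-redirect, a missing target, or a redirect loop.
        if status not in REDIRECT_CODES or target is None or target in chain:
            break
        chain.append(target)
    return chain


def normalize(url):
    """Drop ignored query parameters, sort the rest, and remove the fragment."""
    parts = urlsplit(url)
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def duplicate_groups(pages):
    """Group 200-status URLs that normalize to the same address."""
    groups = defaultdict(list)
    for url, (status, _target) in pages.items():
        if status == 200:
            groups[normalize(url)].append(url)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


def long_chains(pages):
    """Return redirect chains with more than one hop (three or more URLs)."""
    chains = (redirect_chain(url, pages) for url in pages)
    return [chain for chain in chains if len(chain) > 2]
```

With the sample data, long_chains(PAGES) reports the single two-hop chain that starts at /old, and duplicate_groups(PAGES) groups the two /shoes URLs together. Grouping by normalized URL only catches parameter duplicates; detecting pages with identical content at unrelated URLs would need a content hash, which this sketch leaves out.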
How crawler.sh helps
Run crawler crawl to map your site structure and identify pages that may be wasting crawl budget. Use crawler seo to flag redirect chains, duplicate content, and other issues. The crawler info command shows your status code distribution so you can spot error pages consuming crawl resources.
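
To show what a status code distribution can reveal, here is a small Python sketch. It does not reproduce the output format of crawler info, which is not described here; the STATUSES list is hypothetical sample data, with one entry per crawled URL.

```python
from collections import Counter

# Hypothetical status codes, one per crawled URL.
STATUSES = [200, 200, 200, 301, 301, 404, 404, 404, 500, 200]


def status_distribution(statuses):
    """Return {status code: share of all responses}, most common first."""
    total = len(statuses)
    if total == 0:
        return {}
    counts = Counter(statuses)
    return {code: count / total for code, count in counts.most_common()}


def non_2xx_share(statuses):
    """Return the share of responses that were not a 2xx success."""
    if not statuses:
        return 0.0
    non_success = sum(1 for code in statuses if not 200 <= code < 300)
    return non_success / len(statuses)
```

A high non-2xx share suggests that many crawler requests end in redirects or errors instead of content, which is the kind of signal worth investigating before it affects how often important pages are visited.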