Guides
March 6, 2026

How to Find Broken Links of a Website with CLI

Learn how to detect broken links and dead pages on any website using crawler.sh CLI. Crawl your site, identify 4xx/5xx errors, and export a report.

Mehmet Kose
4 min read

Broken links hurt both user experience and SEO. When a visitor clicks a link that leads to a 404 page, they lose trust. When search engine crawlers hit dead ends, they waste crawl budget and may lower your rankings. The longer broken links stay on your site, the more damage they do.

This guide shows you how to find every broken link on a website using the crawler.sh CLI.

Step 1: Install crawler.sh CLI

Install the CLI with a single command:

curl -fsSL https://install.crawler.sh | sh

This downloads the correct binary for your operating system and architecture, places it in ~/.crawler/bin/, and adds it to your PATH. Restart your terminal or run source ~/.bashrc (or ~/.zshrc) to pick up the new PATH entry.

Verify the installation:

crawler --version

Step 2: Crawl the target website

Run a full crawl of the website you want to check for broken links:

crawler crawl https://example.com

The crawler follows every internal link it discovers, recording the HTTP status code for each page. Results are saved as an NDJSON file (.crawl) in the current directory. For larger sites, increase the page limit:

crawler crawl https://example.com --max-pages 5000

The crawl captures every page’s status code (200, 301, 404, 500, etc.), the URL that linked to it, and the depth at which it was found.
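Because the .crawl file is NDJSON (one JSON object per line), you can also slice it yourself with jq. The sketch below builds a tiny sample file and filters out failing pages; the field names used here (url, status, found_on) are assumptions for illustration, so inspect one line of your real .crawl file for the actual keys before adapting it:

```shell
# Create a small sample in the assumed NDJSON shape: one JSON object
# per crawled page with "url", "status", and "found_on" fields.
cat > sample.crawl <<'EOF'
{"url": "https://example.com/", "status": 200, "found_on": null}
{"url": "https://example.com/old-page", "status": 404, "found_on": "https://example.com/"}
{"url": "https://example.com/api", "status": 500, "found_on": "https://example.com/"}
EOF

# Keep only pages whose status is 400 or above, printing the broken
# URL together with the page that linked to it.
jq -r 'select(.status >= 400) | "\(.status)\t\(.url)\t<- \(.found_on)"' sample.crawl
```

Once you have confirmed the key names, point the same filter at your real .crawl file instead of sample.crawl.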

Step 3: Check the crawl summary

Use the info command to get a quick overview of status code distribution:

crawler info example-com.crawl

This displays a breakdown of all crawled pages by HTTP status code. Look for the 4xx and 5xx categories: these are your broken links and server errors. The summary also shows total pages crawled, average response times, and redirect statistics.

Step 4: Generate a detailed broken link report

For a detailed report that flags every broken link with context, run the seo command:

crawler seo example-com.crawl

The SEO report checks for:

  • 404 Not Found: Pages that no longer exist, often caused by deleted content or changed URLs
  • 410 Gone: Pages explicitly marked as permanently removed
  • 5xx Server Errors: Pages that fail due to server-side issues
  • Broken internal links: Pages on your site that link to non-existent URLs

Each issue lists the broken URL along with the page that links to it, making it straightforward to locate and fix the source of the problem.

Step 5: Export the report

Export the results to a file for sharing with your team or tracking fixes:

crawler seo example-com.crawl --format csv --output broken-links.csv

You can also export as plain text:

crawler seo example-com.crawl --format txt --output broken-links.txt

The CSV format is ideal for importing into a spreadsheet where you can filter by status code, sort by source page, and assign fixes to team members.
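If you want a quick priority list without leaving the terminal, a short awk pass over the export can group broken links by the page that contains them. The three-column layout below (status, url, found_on) is an assumption for illustration; check the header row of your own CSV and adjust the field numbers to match:

```shell
# Sample export in an assumed layout: status, broken URL, and the
# page that links to it.
cat > broken-links.csv <<'EOF'
status,url,found_on
404,https://example.com/old-page,https://example.com/blog
500,https://example.com/api,https://example.com/docs
404,https://example.com/gone,https://example.com/blog
EOF

# Count broken links per source page so you can fix the worst pages first.
awk -F, 'NR > 1 { count[$3]++ } END { for (page in count) print count[page], page }' \
  broken-links.csv | sort -rn
# → 2 https://example.com/blog
#   1 https://example.com/docs
```

Pages at the top of this list give you the biggest payoff per edit, since fixing one source page removes several broken links at once.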

Step 6: Fix the broken links

Once you have your list of broken links, here is how to address them:

  • Set up redirects for moved content. If a page moved to a new URL, add a 301 redirect from the old URL to the new one.
  • Update internal links. Find every page that links to the broken URL and update the link to point to the correct destination.
  • Remove links to deleted content. If the content no longer exists and there is no replacement, remove the link entirely rather than leaving a dead end.
  • Fix server errors. 5xx errors indicate server-side problems. Check your application logs to identify and resolve the underlying issue.
  • Create a useful 404 page. While you fix broken links, make sure your 404 page helps visitors find what they were looking for with search or navigation links.
  • Re-crawl after fixing. Run crawler crawl again to confirm all broken links have been resolved.
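
One way to confirm progress between crawls is to export the broken-link report before and after your fixes and diff the URL columns: any URL that appears in both exports is still broken. This is a bash sketch (process substitution needs bash, not plain sh), and the CSV layout is again an assumption to adapt to your own export:

```shell
# Broken URLs from the first crawl and from the re-crawl (assumed CSV
# layout: status,url,found_on; adjust the cut field to match yours).
cat > before.csv <<'EOF'
status,url,found_on
404,https://example.com/old-page,https://example.com/blog
500,https://example.com/api,https://example.com/docs
EOF
cat > after.csv <<'EOF'
status,url,found_on
500,https://example.com/api,https://example.com/docs
EOF

# comm -12 prints lines common to both sorted lists: URLs that are
# still broken after the fixes.
comm -12 \
  <(tail -n +2 before.csv | cut -d, -f2 | sort) \
  <(tail -n +2 after.csv | cut -d, -f2 | sort)
# prints https://example.com/api
```

An empty result means every broken link from the first crawl has been resolved.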