How to Find Broken Links on a Website with the crawler.sh CLI
Learn how to detect broken links and dead pages on any website using crawler.sh CLI. Crawl your site, identify 4xx/5xx errors, and export a report.
Broken links hurt both user experience and SEO. When a visitor clicks a link that leads to a 404 page, they lose trust. When search engine crawlers hit dead ends, they waste crawl budget and may lower your rankings. The longer broken links stay on your site, the more damage they do.
This guide shows you how to find every broken link on a website using the crawler.sh CLI.
Step 1: Install crawler.sh CLI
Install the CLI with a single command:
```shell
curl -fsSL https://install.crawler.sh | sh
```

This downloads the correct binary for your operating system and architecture, places it in `~/.crawler/bin/`, and adds it to your PATH. Restart your terminal or run `source ~/.bashrc` (or `~/.zshrc`) to pick up the new PATH entry.
Verify the installation:
```shell
crawler --version
```

Step 2: Crawl the target website
Run a full crawl of the website you want to check for broken links:
```shell
crawler crawl https://example.com
```

The crawler follows every internal link it discovers, recording the HTTP status code for each page. Results are saved as an NDJSON file (`.crawl`) in the current directory. For larger sites, increase the page limit:
```shell
crawler crawl https://example.com --max-pages 5000
```

The crawl captures every page’s status code (200, 301, 404, 500, etc.), the URL that linked to it, and the depth at which it was found.
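If you want to inspect the raw crawl data before running any reports, NDJSON is easy to work with using standard tools. The sketch below fabricates a tiny sample file; the field names (`url`, `status`, `depth`) are assumptions about the `.crawl` schema, so check a line of your own file first.

```shell
# A tiny sample file illustrating the assumed NDJSON shape of a .crawl file
# (field names are assumptions, not taken from crawler.sh documentation)
cat > sample-crawl.ndjson <<'EOF'
{"url":"https://example.com/","status":200,"depth":0}
{"url":"https://example.com/about","status":200,"depth":1}
{"url":"https://example.com/old-page","status":404,"depth":1}
EOF

# One JSON object per line means grep alone can surface broken pages
grep '"status":404' sample-crawl.ndjson
```

Because each record sits on its own line, no JSON parser is needed for quick spot checks like this.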
Step 3: Check the crawl summary
Use the info command to get a quick overview of status code distribution:
```shell
crawler info example-com.crawl
```

This displays a breakdown of all crawled pages by HTTP status code. Look for the 4xx and 5xx categories: these are your broken links and server errors. The summary also shows total pages crawled, average response times, and redirect statistics.
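You can also approximate this breakdown yourself with standard text tools. This is a sketch under the assumption that each NDJSON record carries a numeric `status` field; adjust the pattern if your `.crawl` lines differ.

```shell
# Sample crawl records (assumed schema: a numeric "status" field per line)
cat > status-sample.ndjson <<'EOF'
{"url":"https://example.com/","status":200}
{"url":"https://example.com/a","status":200}
{"url":"https://example.com/b","status":404}
{"url":"https://example.com/c","status":500}
EOF

# Count pages per status code, most common first
grep -o '"status":[0-9]*' status-sample.ndjson |
  cut -d: -f2 |
  sort |
  uniq -c |
  sort -rn
```

On the sample data this prints two pages with status 200 and one each with 404 and 500, mirroring the kind of distribution the `info` command summarizes.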
Step 4: Run SEO analysis for broken links
For a detailed report that flags every broken link with context, run the seo command:
```shell
crawler seo example-com.crawl
```

The SEO report checks for:
- 404 Not Found: Pages that no longer exist, often caused by deleted content or changed URLs
- 410 Gone: Pages explicitly marked as permanently removed
- 5xx Server Errors: Pages that fail due to server-side issues
- Broken internal links: Pages on your site that link to non-existent URLs
Each issue lists the broken URL along with the page that links to it, making it straightforward to locate and fix the source of the problem.
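To cross-check the report against the raw crawl data, you can extract each broken URL together with its linking page directly from the NDJSON file. The `referrer` field name here is an assumption for illustration, not a documented part of the `.crawl` schema; substitute whatever your records actually use.

```shell
# Sample records with an assumed "referrer" field naming the linking page
cat > referrer-sample.ndjson <<'EOF'
{"url":"https://example.com/","status":200,"referrer":""}
{"url":"https://example.com/gone","status":404,"referrer":"https://example.com/"}
{"url":"https://example.com/err","status":500,"referrer":"https://example.com/blog"}
EOF

# Print "broken-url <- linking-page" for every 4xx/5xx record
awk -F'"' '/"status":[45][0-9][0-9]/ {
  url = ""; ref = ""
  for (i = 1; i <= NF; i++) {
    if ($i == "url") url = $(i + 2)
    if ($i == "referrer") ref = $(i + 2)
  }
  print url " <- " ref
}' referrer-sample.ndjson
```

Splitting on double quotes is a crude but serviceable way to pull values out of flat, one-line JSON without a parser.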
Step 5: Export the broken links report
Export the results to a file for sharing with your team or tracking fixes:
```shell
crawler seo example-com.crawl --format csv --output broken-links.csv
```

You can also export as plain text:
```shell
crawler seo example-com.crawl --format txt --output broken-links.txt
```

The CSV format is ideal for importing into a spreadsheet where you can filter by status code, sort by source page, and assign fixes to team members.
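Once exported, the CSV can be sliced from the command line as well. The three-column layout below (URL, status, source page) is assumed for illustration; match it to the headers in your actual export.

```shell
# Sample export with an assumed three-column layout
cat > sample-broken.csv <<'EOF'
url,status,source
https://example.com/gone,404,https://example.com/
https://example.com/err,500,https://example.com/blog
https://example.com/moved,301,https://example.com/docs
EOF

# Keep only real 4xx/5xx rows (skip the header and the 301 redirect)
awk -F, 'NR > 1 && $2 >= 400 { print $1 " (" $2 ") linked from " $3 }' sample-broken.csv
```

The numeric comparison naturally excludes redirects, so only genuine errors survive the filter.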
How to fix broken links
Once you have your list of broken links, here is how to address them:
- Set up redirects for moved content. If a page moved to a new URL, add a 301 redirect from the old URL to the new one.
- Update internal links. Find every page that links to the broken URL and update the link to point to the correct destination.
- Remove links to deleted content. If the content no longer exists and there is no replacement, remove the link entirely rather than leaving a dead end.
- Fix server errors. 5xx errors indicate server-side problems. Check your application logs to identify and resolve the underlying issue.
- Create a useful 404 page. While you fix broken links, make sure your 404 page helps visitors find what they were looking for with search or navigation links.
- Re-crawl after fixing. Run `crawler crawl` again to confirm all broken links have been resolved.
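To verify your fixes at a glance, you can compare the broken-link URLs exported before and after the re-crawl. The CSV layout here (URL in the first column, header row on line 1) is an assumption for illustration.

```shell
# Broken-link URLs exported before and after fixes (assumed layout:
# URL in the first column, header row on line 1)
cat > sample-before.csv <<'EOF'
url,status
https://example.com/err,500
https://example.com/gone,404
EOF
cat > sample-after.csv <<'EOF'
url,status
https://example.com/err,500
EOF

# Sorted URL lists, header stripped
tail -n +2 sample-before.csv | cut -d, -f1 | sort > before.urls
tail -n +2 sample-after.csv | cut -d, -f1 | sort > after.urls

# Lines only in the "before" list are the links you have fixed
comm -23 before.urls after.urls
```

Anything `comm` still prints in the other direction (`comm -13`) would be a newly broken link introduced since the first crawl.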