Comparison

crawler.sh vs Jina Reader

Single-URL cloud reader vs local crawler with full-site sweep and Markdown archive export.

Jina Reader (r.jina.ai) is a cloud endpoint that turns a single URL into Markdown with a short HTTP call. It is excellent for one-shot lookups inside an agent loop. crawler.sh is built for the next step: sweeping the whole site, deduping, and producing a Markdown corpus on disk. Both produce clean Markdown; they answer different questions.
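As a rough illustration of the single-URL call, the sketch below assumes the prefix-style usage of r.jina.ai, where the target URL is appended to the endpoint. The helper names `reader_url` and `fetch_markdown` are made up for this example, and API keys, rate limits, and error handling are left out.

```python
import urllib.request

# Assumption: Reader is called by prefixing the target URL with the endpoint.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the request URL for one page (hypothetical helper)."""
    return READER_PREFIX + target_url


def fetch_markdown(target_url: str, timeout: float = 30.0) -> str:
    """Fetch one page as Markdown. This makes a real network call."""
    with urllib.request.urlopen(reader_url(target_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

One call covers one page; a site-wide corpus would mean discovering the URLs yourself and looping over them.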

Side by side

Honest comparison on the axes that decide which tool fits your workflow.

Primary unit of work
Jina Reader: One URL at a time, returned as Markdown over HTTP.
crawler.sh: A whole site or a section of one. BFS crawl, configurable depth and page cap.

Where it runs
Jina Reader: Vendor cloud.
crawler.sh: Local binary on your laptop or server.

Pricing model
Jina Reader: Free tier with rate limits; paid usage billed by tokens consumed.
crawler.sh: Free up to 1,000 pages per crawl. $99 a year unlocks 10,000 pages and Markdown archive export.

JavaScript rendering
Jina Reader: Yes.
crawler.sh: Custom JavaScript engine, no headless Chrome. Chrome 131 TLS fingerprint and shared cookie jar.

Bulk Markdown export
Jina Reader: You call the endpoint per URL and assemble the corpus yourself.
crawler.sh: One command exports the entire crawl as a Markdown archive with YAML frontmatter (url, title, captured_at, word_count, language).

robots.txt by default
Jina Reader: Vendor controls.
crawler.sh: On by default. --ignore-robots to opt out. Per-host adaptive backoff on 429 and 403.

Best fit
Jina Reader: Agent loops that need one page at a time, or quick one-off URL-to-Markdown conversion.
crawler.sh: Building a corpus from a whole site, repeat crawls on a schedule, or audits that need every page.

Data path
Jina Reader: URLs go through vendor infrastructure.
crawler.sh: Pages are fetched directly by your machine.

Desktop app
Jina Reader: No.
crawler.sh: Yes, plus the CLI.

Pricing and feature notes reflect publicly listed information at the time of writing.
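To make "BFS crawl, configurable depth and page cap" concrete, here is a generic sketch of breadth-first traversal with both limits and a seen-set for deduplication. It is not crawler.sh's implementation: `bfs_crawl`, `get_links`, and the parameter names are hypothetical, and the link lookup is a plain function so the example runs without any network access.

```python
from collections import deque


def bfs_crawl(start, get_links, max_depth=2, max_pages=1000):
    """Visit pages breadth-first, stopping at a depth limit and a page cap.

    get_links(url) is a hypothetical callback returning the links on a page.
    """
    seen = {start}              # dedupe: each URL is queued at most once
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:  # do not follow links past the depth limit
            continue
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Toy in-memory site standing in for real pages.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/"],
    "/b": ["/a"],
    "/a/1": ["/a/1/x"],
}

pages = bfs_crawl("/", lambda url: SITE.get(url, []), max_depth=2)
```

With this toy graph, `/a/1/x` sits at depth 3 and is never visited, and lowering `max_pages` stops the sweep early regardless of depth.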

When to pick which

Both tools solve real problems. Pick based on where the work actually runs and how you pay for it: per token or per year.

Pick Jina Reader when

You need to convert individual URLs to Markdown on demand from inside an agent or script, the volume is low to moderate, and the simplicity of a single HTTP endpoint outweighs the per-call cost. Jina Reader is genuinely fast at that shape of work.

Pick crawler.sh when

You need the whole site, not a URL. You want a Markdown directory on disk you can hand to a tokenizer or RAG indexer. You want crawls to run locally, with no per-token billing, no API key, and politeness built in.
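Before handing files to a tokenizer or RAG indexer, you will usually want to separate metadata from body text. The sketch below assumes each exported file starts with a `---`-delimited frontmatter block of simple `key: value` lines; the exact layout of the crawler.sh export is an assumption here, and `split_frontmatter` is a hypothetical helper, not part of the tool. A real YAML parser would be the safer choice for anything beyond flat string fields.

```python
def split_frontmatter(text):
    """Split a Markdown document into (metadata dict, body).

    Assumes flat `key: value` frontmatter between two `---` lines.
    Values are kept as strings; nested YAML is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text         # no frontmatter at all
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}, text             # opening delimiter without a closing one
```

The returned dictionary can travel alongside each chunk in the index, so that search results link back to the source `url`.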

Try the local-first path

Install in one command. Crawl any site into clean Markdown in seconds. Free up to 1,000 pages, $99 a year for 10,000.
