JavaScript rendering is the process of executing JavaScript code on a web page to generate the final HTML that users see. Many modern websites ship a minimal HTML shell and use JavaScript to load content dynamically. Without JavaScript rendering, a crawler only sees the initial empty or incomplete HTML.
When you view a page in your browser, the browser runs each script, fetches data from APIs, manipulates the DOM, and produces the finished page. A basic HTTP client that simply downloads the raw HTML skips this entire step. For sites built with React, Vue, Angular, or similar frameworks and rendered on the client, this means the crawler misses almost all the actual content.
Static HTML vs rendered HTML
| Aspect | Static HTML | JavaScript-rendered HTML |
|---|---|---|
| Content source | Server sends complete HTML | Browser executes JS to build DOM |
| What a non-rendering crawler sees | Full content immediately | Empty shell or placeholder |
| Examples | Traditional blogs, docs sites | React, Vue, Angular apps |
| Crawling requirement | Standard HTTP request | JavaScript execution engine |
| Performance | Fast, low overhead | Slower, more memory |
Consider a typical React app. The raw HTML might look like this:

    <div id="root"></div>
    <script src="/app.js"></script>

A crawler without JS rendering sees only the empty div. After rendering, that same div contains the full article, navigation, and footer.
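The gap between raw and rendered HTML can be made concrete with a minimal sketch that extracts the visible text from each. It uses only Python's standard `html.parser`; the `rendered_html` string is an invented stand-in for what the DOM might contain after the app has run, not output from a real renderer.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# What a plain HTTP client receives for the React shell shown above.
raw_html = '<div id="root"></div><script src="/app.js"></script>'
# Hypothetical DOM after rendering; the content is made up for illustration.
rendered_html = '<div id="root"><nav>Home</nav><article>Hello, world</article></div>'

print(repr(visible_text(raw_html)))       # ''
print(repr(visible_text(rendered_html)))  # 'Home Hello, world'
```

The raw shell yields an empty string, which is all a non-rendering crawler has to work with.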
Why JavaScript rendering matters for crawlers
When a crawler fetches a page, it receives the raw HTML from the server. If that HTML contains <script> tags that populate the page, the crawler must execute those scripts to access the actual content. This is critical for:
- Content extraction - Reading article text, product descriptions, and metadata
- Link discovery - Finding navigation links that are injected by JavaScript
- SEO analysis - Checking titles, descriptions, and headings that may be set by JS
- Single-page applications - Crawling apps where every route is rendered client-side
- Lazy-loaded content - Images, comments, or related articles that load on scroll
- Dynamic meta tags - Open Graph tags or canonical URLs set by JavaScript after page load
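Link discovery is the easiest of these to illustrate. The sketch below collects `<a href>` targets from a raw shell and from a rendered version of the same page, then reports the links that only exist after rendering. The `example.com` URLs and the rendered markup are assumptions made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.add(urljoin(self.base_url, href))


def extract_links(html: str, base_url: str) -> set:
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


BASE = "https://example.com/"  # placeholder site
raw_html = '<div id="root"></div><script src="/app.js"></script>'
# Hypothetical markup injected by JavaScript.
rendered_html = (
    '<div id="root"><nav><a href="/pricing">Pricing</a>'
    '<a href="/blog/">Blog</a></nav></div>'
)

raw_links = extract_links(raw_html, BASE)
rendered_links = extract_links(rendered_html, BASE)
only_after_rendering = sorted(rendered_links - raw_links)
print(only_after_rendering)
# ['https://example.com/blog/', 'https://example.com/pricing']
```

A crawler that never renders the page would have no way to reach either URL from this entry point.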
Search engines like Google can render JavaScript, but rendering is a separate step that is queued after the initial crawl. A page that relies heavily on JavaScript may be indexed later, or with outdated content, if the rendering pipeline lags behind the initial crawl.
Approaches to JavaScript rendering
- Headless browsers - Full Chromium or WebKit engines that run the page like a real browser. These are accurate but resource-intensive. Each page consumes significant memory and CPU.
- Lightweight JS engines - QuickJS-based renderers that execute JavaScript without the full browser overhead. Faster and lighter, though they may lack some browser APIs.
- Hybrid approaches - Detecting whether a page needs JS rendering and applying it selectively. This avoids wasting resources on static pages while ensuring dynamic content is captured.
- Prerendering services - External services that render pages on demand and cache the result. Useful for large sites, but they add latency and cost.
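The render-on-demand-and-cache idea behind prerendering can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular service works: `PrerenderCache`, its `ttl_seconds` parameter, and `fake_render` are hypothetical names, and `fake_render` stands in for whatever headless browser or lightweight engine does the real rendering.

```python
import time
from typing import Callable, Dict, Tuple


class PrerenderCache:
    """Renders a URL on first request and reuses the result until it expires."""

    def __init__(
        self,
        render: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._entries.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh enough: skip the expensive render
        html = self._render(url)
        self._entries[url] = (now, html)
        return html


render_calls = []


def fake_render(url: str) -> str:
    # Stand-in for a real renderer; it only records that it was called.
    render_calls.append(url)
    return f"<html><body>rendered {url}</body></html>"


cache = PrerenderCache(fake_render, ttl_seconds=60)
first = cache.get("https://example.com/")
second = cache.get("https://example.com/")
print(len(render_calls))  # 1
```

The second request is served from the cache, which is where the cost saving comes from. The trade-off is staleness: a longer TTL means fewer renders but older content.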
When you need JavaScript rendering
You likely need JS rendering if your target site shows any of these characteristics:
- The raw HTML contains little or no visible text
- Content loads after an initial spinner or skeleton UI
- Navigation uses client-side routing without full page reloads
- Product listings or search results come from API calls
- Meta tags or canonical URLs are set by JavaScript
- The site uses a modern frontend framework without server-side rendering
You can skip JS rendering if the site serves complete HTML server-side, as most blogs, documentation sites, and traditional CMS-driven pages do.
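Two of the characteristics listed above can be checked directly against the raw HTML. The sketch below is a rough heuristic, not a definitive detector: the 200-character threshold and the marker strings are assumptions chosen for illustration, and the regex-based tag stripping is deliberately crude.

```python
import re

# Illustrative mount-point and framework hints; real sites vary widely.
FRAMEWORK_MARKERS = ('id="root"', 'id="app"', "data-reactroot", "ng-version")
MIN_VISIBLE_CHARS = 200  # arbitrary threshold for "little or no visible text"


def visible_text_length(html: str) -> int:
    """Approximate count of visible characters outside script/style blocks."""
    without_code = re.sub(
        r"<(script|style)\b.*?</\1\s*>", " ", html, flags=re.S | re.I
    )
    text = re.sub(r"<[^>]+>", " ", without_code)
    return len(" ".join(text.split()))


def rendering_signals(html: str) -> list:
    """Returns the names of the signals that suggest JS rendering is needed."""
    signals = []
    if visible_text_length(html) < MIN_VISIBLE_CHARS:
        signals.append("little visible text")
    lowered = html.lower()
    if any(marker in lowered for marker in FRAMEWORK_MARKERS):
        signals.append("framework marker")
    return signals


def probably_needs_js(html: str) -> bool:
    # Any single signal is treated as a reason to try rendering.
    return bool(rendering_signals(html))


shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
static_page = "<html><body><article>" + "word " * 100 + "</article></body></html>"

print(rendering_signals(shell))        # ['little visible text', 'framework marker']
print(probably_needs_js(static_page))  # False
```

Treating any one signal as sufficient will flag some short static pages. A stricter crawler could require two signals before paying the rendering cost.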
How crawler.sh handles JavaScript rendering
crawler.sh includes a built-in JavaScript rendering engine based on QuickJS. It executes JavaScript, builds the DOM, and extracts the rendered HTML. The engine implements browser APIs like document.querySelector, window.history, setTimeout, URL, Blob, and FileReader so that most scripts run without modification.
The crawler offers three modes:
- Auto-detect - The site profiler analyzes the first few pages and enables JS rendering only when needed. It looks for empty body shells, JavaScript framework markers, and script-to-text ratios.
- Always - JS rendering is applied to every page. Best when you know the entire site requires it.
- Never - Only the raw HTML is used. Fastest option for static sites.
Auto-detect avoids the performance penalty of rendering static pages while ensuring dynamic content is captured. The profiler also sets an appropriate crawl posture for JavaScript-heavy sites, adding extra drain time and retry attempts to handle slower-loading content.
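To show how three such modes and a sampled profile could fit together, here is a minimal sketch. It is not crawler.sh's implementation or API: `RenderMode`, `CrawlPlan`, `plan_crawl`, and `is_empty_shell` are hypothetical names, the drain and retry values are made up, and the rule that a single matching sample page enables rendering is an assumption.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Sequence


class RenderMode(Enum):
    AUTO = "auto"
    ALWAYS = "always"
    NEVER = "never"


@dataclass(frozen=True)
class CrawlPlan:
    render_js: bool
    drain_seconds: float  # extra wait for late-loading content (made-up value)
    max_retries: int      # made-up value


def is_empty_shell(html: str) -> bool:
    """Crude check: the page has a script tag but no text outside tags."""
    text_outside_tags = re.sub(r"<[^>]+>", "", html)
    return "<script" in html.lower() and not text_outside_tags.strip()


def plan_crawl(
    mode: RenderMode,
    sample_pages: Sequence[str],
    page_needs_js: Callable[[str], bool] = is_empty_shell,
) -> CrawlPlan:
    if mode is RenderMode.ALWAYS:
        render = True
    elif mode is RenderMode.NEVER:
        render = False
    else:
        # AUTO: enable rendering if any sampled page looks like it needs it.
        render = any(page_needs_js(html) for html in sample_pages)

    if render:
        # JavaScript-heavy sites get a more patient posture.
        return CrawlPlan(render_js=True, drain_seconds=2.0, max_retries=3)
    return CrawlPlan(render_js=False, drain_seconds=0.0, max_retries=1)


shell = '<div id="root"></div><script src="/app.js"></script>'
static_page = "<article><h1>Title</h1><p>Body text</p></article>"

print(plan_crawl(RenderMode.AUTO, [static_page, shell]).render_js)  # True
print(plan_crawl(RenderMode.AUTO, [static_page]).render_js)         # False
```

Passing the detection predicate in as a parameter keeps the mode logic separate from the heuristics, so a richer check (framework markers, script-to-text ratio) can be swapped in without touching `plan_crawl`.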