What is JavaScript Rendering in Web Crawling

JavaScript rendering is the process of executing the JavaScript on a web page to generate the final HTML that users see. Many modern websites ship a minimal HTML shell and use JavaScript to load content dynamically. Without JavaScript rendering, a crawler sees only the initial empty or incomplete HTML.

When you view a page in your browser, the browser runs the page's scripts, fetches data from APIs, manipulates the DOM, and produces the finished page. A basic HTTP client that simply downloads the raw HTML skips this entire step. For sites built with React, Vue, Angular, or similar frameworks and rendered on the client, this means the crawler misses almost all the actual content.

Static HTML vs rendered HTML

Aspect	Static HTML	JavaScript-rendered HTML
Content source	Server sends complete HTML	Browser executes JS to build DOM
What a non-rendering crawler sees	Full content immediately	Empty shell or placeholder
Examples	Traditional blogs, docs sites	React, Vue, Angular apps
Crawling requirement	Standard HTTP request	JavaScript execution engine
Performance	Fast, low overhead	Slower, more memory

Consider a typical React app. The raw HTML might look like this:

<div id="root"></div>
<script src="/app.js"></script>

A crawler without JS rendering sees only the empty div. After rendering, that same div contains the full article, navigation, and footer.
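
The difference can be made concrete with a small check of how much visible text each version contains. The following is a minimal sketch using only Python's standard library. The RENDERED_HTML string is a made-up stand-in for what a rendering engine might produce, not output from any real site or tool.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# The shell from the example above.
RAW_HTML = '<div id="root"></div>\n<script src="/app.js"></script>'

# Hypothetical result after the scripts have run (assumption for illustration).
RENDERED_HTML = (
    '<div id="root"><nav>Home</nav>'
    "<article>Full article text</article>"
    "<footer>Footer</footer></div>"
)

print(repr(visible_text(RAW_HTML)))       # the shell has no visible text
print(repr(visible_text(RENDERED_HTML)))  # the rendered DOM does
```

A crawler that only downloads the raw HTML is working with the first string, which is why content extraction comes back empty.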

Why JavaScript rendering matters for crawlers

When a crawler fetches a page, it receives the raw HTML from the server. If that HTML contains <script> tags that populate the page, the crawler must execute those scripts to access the actual content. This is critical for:

Content extraction - Reading article text, product descriptions, and metadata
Link discovery - Finding navigation links that are injected by JavaScript
SEO analysis - Checking titles, descriptions, and headings that may be set by JS
Single-page applications - Crawling apps where every route is rendered client-side
Lazy-loaded content - Images, comments, or related articles that load on scroll
Dynamic meta tags - Open Graph tags or canonical URLs set by JavaScript after page load

Search engines like Google can render JavaScript, but their rendering queue has limits. A page that relies heavily on JavaScript may be crawled less frequently or indexed with outdated content if the rendering pipeline lags behind the initial crawl.
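
Link discovery is the easiest of these to quantify: compare the links found in the raw HTML with those found after rendering. The sketch below assumes you already have both versions of a page as strings. The function names and the example.com URLs are illustrative and not part of any particular crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> elements."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(urljoin(self.base_url, href))


def extract_links(html: str, base_url: str) -> set:
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


def links_added_by_js(raw_html: str, rendered_html: str, base_url: str) -> set:
    """Links present only after rendering, i.e. invisible to a raw-HTML crawler."""
    return extract_links(rendered_html, base_url) - extract_links(raw_html, base_url)


# Hypothetical page: one static link, one injected by a script.
raw = '<a href="/about">About</a><div id="root"></div>'
rendered = (
    '<a href="/about">About</a>'
    '<div id="root"><a href="/products/1">Product 1</a></div>'
)

missing = links_added_by_js(raw, rendered, "https://example.com/")
print(missing)
```

If this set is non-empty for a representative sample of pages, a crawler without rendering will never reach those URLs.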

Approaches to JavaScript rendering

Headless browsers - Full Chromium or WebKit engines that run the page like a real browser. These are accurate but resource-intensive. Each page consumes significant memory and CPU.
Lightweight JS engines - Renderers built on an embeddable engine such as QuickJS that execute JavaScript without the full browser overhead. They are faster and lighter, though they may lack some browser APIs.
Hybrid approaches - Detecting whether a page needs JS rendering and applying it selectively. This avoids wasting resources on static pages while ensuring dynamic content is captured.
Prerendering services - External services that render pages on demand and cache the result. Useful for large sites but add latency and cost.
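
The hybrid and caching ideas combine naturally: fetch the raw HTML, render only when a check says it is needed, and remember the result. The sketch below shows that control flow with injected callables so it stays independent of any real HTTP client or rendering engine. HybridFetcher, the fake fetch and render functions, and the URLs are all assumptions made up for this example.

```python
from typing import Callable, Dict


class HybridFetcher:
    """Fetch raw HTML first; render only when a predicate says it is needed."""

    def __init__(
        self,
        fetch_raw: Callable[[str], str],
        render: Callable[[str, str], str],
        needs_render: Callable[[str], bool],
    ) -> None:
        self.fetch_raw = fetch_raw
        self.render = render
        self.needs_render = needs_render
        self.cache: Dict[str, str] = {}
        self.render_count = 0  # how many times the expensive path ran

    def get(self, url: str) -> str:
        if url in self.cache:
            return self.cache[url]
        html = self.fetch_raw(url)
        if self.needs_render(html):
            html = self.render(url, html)
            self.render_count += 1
        self.cache[url] = html
        return html


# Stand-ins for a real HTTP client and rendering engine (hypothetical).
PAGES = {
    "https://example.com/blog": "<article>A complete server-rendered post.</article>",
    "https://example.com/app": '<div id="root"></div><script src="/app.js"></script>',
}


def fake_fetch(url: str) -> str:
    return PAGES[url]


def fake_render(url: str, raw_html: str) -> str:
    return raw_html.replace(
        '<div id="root"></div>', '<div id="root"><h1>Dashboard</h1></div>'
    )


def has_empty_root(html: str) -> bool:
    return '<div id="root"></div>' in html


fetcher = HybridFetcher(fake_fetch, fake_render, has_empty_root)
blog_html = fetcher.get("https://example.com/blog")  # static: raw HTML is used
app_html = fetcher.get("https://example.com/app")    # shell: rendered once
app_again = fetcher.get("https://example.com/app")   # served from the cache
print(fetcher.render_count)
```

In a real crawler the cache would need an expiry policy, since rendered content can go stale just as raw HTML can.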

When you need JavaScript rendering

You need JS rendering if your target site shows any of these characteristics:

The raw HTML contains little or no visible text
Content loads after an initial spinner or skeleton UI
Navigation uses client-side routing without full page reloads
Product listings or search results come from API calls
Meta tags or canonical URLs are set by JavaScript
The site uses a frontend framework such as React, Vue, or Angular without server-side rendering

You can skip JS rendering if the site serves complete HTML server-side, as most blogs, documentation sites, and traditional CMS-driven pages do.
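
Several of these characteristics can be checked mechanically from the raw HTML alone. The following is a rough heuristic sketch, not a definitive detector. The 200-character threshold and the list of mount-point ids (root, app, __next) are assumptions chosen for illustration and would need tuning against real sites.

```python
import re

# An empty mount node such as <div id="root"></div> (ids are assumptions).
EMPTY_MOUNT = re.compile(
    r"""<div[^>]*\bid=["'](?:root|app|__next)["'][^>]*>\s*</div>""",
    re.IGNORECASE,
)
SCRIPT_OR_STYLE = re.compile(r"<(script|style)\b.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>", re.DOTALL)
SCRIPT_TAG = re.compile(r"<script\b", re.IGNORECASE)


def strip_tags(html: str) -> str:
    """Crude visible-text approximation: drop scripts, styles, then all tags."""
    without_code = SCRIPT_OR_STYLE.sub(" ", html)
    text = ANY_TAG.sub(" ", without_code)
    return " ".join(text.split())


def needs_js_rendering(html: str, min_text_chars: int = 200) -> bool:
    has_scripts = SCRIPT_TAG.search(html) is not None
    little_text = len(strip_tags(html)) < min_text_chars
    empty_mount = EMPTY_MOUNT.search(html) is not None
    # Without scripts there is nothing to execute, so rendering cannot help.
    return has_scripts and (little_text or empty_mount)


shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
static_page = (
    "<html><body><article>" + "word " * 100 + "</article>"
    '<script src="/analytics.js"></script></body></html>'
)

print(needs_js_rendering(shell))
print(needs_js_rendering(static_page))
```

A regex-based check like this is cheap enough to run on every response, but it can misjudge pages that hydrate server-rendered HTML, so treat the result as a hint rather than a verdict.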

How crawler.sh handles JavaScript rendering

crawler.sh includes a built-in JavaScript rendering engine based on QuickJS. It executes JavaScript, builds the DOM, and extracts the rendered HTML. The engine implements browser APIs like document.querySelector, window.history, setTimeout, URL, Blob, and FileReader so that most scripts run without modification.

The crawler offers three modes:

Auto-detect - The site profiler analyzes the first few pages and enables JS rendering only when needed. It looks for empty body shells, JavaScript framework markers, and script-to-text ratios.
Always - JS rendering is applied to every page. Best when you know the entire site requires it.
Never - Only the raw HTML is used. Fastest option for static sites.

Auto-detect mode avoids the performance penalty of rendering static pages while ensuring dynamic content is captured. The profiler also sets an appropriate crawl posture for JavaScript-heavy sites, adding extra drain time and retry attempts to handle slower-loading content.
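
To illustrate how three such modes might resolve to a single decision, here is a conceptual sketch. It is not crawler.sh's actual implementation or configuration interface. RenderMode, should_render, crawl_posture, the 0.5 threshold, and the drain and retry values are all hypothetical names and numbers invented for this example.

```python
from enum import Enum
from typing import Dict, List


class RenderMode(Enum):
    AUTO = "auto"
    ALWAYS = "always"
    NEVER = "never"


def should_render(
    mode: RenderMode,
    sampled_pages_need_js: List[bool],
    min_fraction: float = 0.5,
) -> bool:
    """Resolve a mode to a yes/no decision.

    sampled_pages_need_js holds one flag per profiled page, produced by
    whatever per-page check the profiler uses (not shown here).
    """
    if mode is RenderMode.ALWAYS:
        return True
    if mode is RenderMode.NEVER:
        return False
    if not sampled_pages_need_js:
        return False  # nothing sampled yet: default to the cheap path
    hits = sum(1 for flag in sampled_pages_need_js if flag)
    return hits / len(sampled_pages_need_js) >= min_fraction


def crawl_posture(render_js: bool) -> Dict[str, float]:
    """Illustrative settings only; the numbers are placeholders."""
    if render_js:
        return {"drain_seconds": 2.0, "max_retries": 3}
    return {"drain_seconds": 0.0, "max_retries": 1}


# Three of four sampled pages looked like empty shells.
decision = should_render(RenderMode.AUTO, [True, True, False, True])
print(decision, crawl_posture(decision))
```

The point of the sketch is the shape of the decision: explicit modes short-circuit, and only the automatic mode consults the profiler's samples.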