Desktop Reference
Settings
Section titled “Settings”The Settings card provides these parameters:
| Parameter | Range | Default | Description |
|---|---|---|---|
| Max Pages | 1 – 1,000 | 20 | Maximum number of pages to crawl |
| Max Depth | 1 – 50 | 10 | Maximum crawl depth from the start URL |
| Concurrency | 1 – 20 | 5 | Number of concurrent HTTP requests |
| Delay (ms) | 0 – 5,000 | 50 | Delay between requests in milliseconds |
| Extract Content | on / off | on | Enable content extraction to Markdown |
The desktop app also includes a Dark Mode toggle in the Settings card, persisted across sessions.
Page Data Model
Section titled “Page Data Model”Each crawled page contains the following fields, serialized to JSON in the output formats.
| Field | Type | Description |
|---|---|---|
url | String | Full URL of the page |
status_code | Number | HTTP status code |
content_type | String? | Content-Type header value |
title | String? | HTML <title> text |
meta_description | String? | <meta name="description"> content |
canonical_url | String? | <link rel="canonical"> href |
discovered_from | String? | Parent URL that linked to this page |
links_found | Number | Number of new same-domain links discovered |
depth | Number | Crawl depth from the start URL |
response_time_ms | Number | HTTP response time in milliseconds |
markdown | String? | Extracted content as Markdown (when enabled) |
word_count | Number? | Word count of extracted content |
byline | String? | Author byline from content extraction |
excerpt | String? | Article excerpt from content extraction |
meta_robots | String? | <meta name="robots"> content |
x_robots_tag | String? | X-Robots-Tag HTTP header value |
rel_next | String? | <link rel="next"> href (pagination) |
rel_prev | String? | <link rel="prev"> href (pagination) |
Fields marked with ? are optional and may be absent depending on the page.
Crawl Events
Section titled “Crawl Events”The crawler emits real-time events during a crawl session, used to display live progress in the dashboard.
| Event | Data | Description |
|---|---|---|
Started | url | Crawl has begun |
Discovered | url, depth | New URL found and enqueued |
PageCrawled | Page object | Page successfully fetched and parsed |
PageError | url, error | Failed to fetch a page |
Progress | crawled, total_discovered | Periodic progress update |
Completed | total_pages, total_errors | Crawl finished |
Export Formats
Section titled “Export Formats”JSON Archive
Section titled “JSON Archive”Complete crawl data as an array of page objects:
[ { "url": "https://example.com", "title": "Example", "meta_description": "An example site", "canonical_url": "https://example.com", "discovered_from": null, "status_code": 200, "markdown": "# Example\n\nContent here...", "word_count": 150, "byline": "Author Name", "excerpt": "A brief summary..." }]Content fields (markdown, word_count, byline, excerpt) are only included when present.
Sitemap XML
Section titled “Sitemap XML”Standard urlset document. Only includes pages with 2xx status codes. Special characters (&, <, >) are escaped in <loc> elements.
<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> <url> <loc>https://example.com</loc> <lastmod>2024-01-01</lastmod> </url></urlset>Content Archive (Premium)
Section titled “Content Archive (Premium)”Full page content as clean Markdown files bundled in a .zip archive.
SEO CSV
Section titled “SEO CSV”Two columns: Issue Type and URL. One row per affected URL, with values double-quote escaped.
"Issue Type","URL""Missing titles","https://example.com/page-1""Short content","https://example.com/page-2"SEO TXT
Section titled “SEO TXT”Human-readable report grouped by issue category with indented URLs.
Missing titles (2) https://example.com/page-1 https://example.com/page-2
Short content (1) https://example.com/page-3Filename Conventions
Section titled “Filename Conventions”Desktop exports follow the pattern:
{domain}-{label}-{timestamp}.{ext}- domain - hostname with dots replaced by dashes (e.g.,
example-com) - label - export type (
crawl-results,sitemap,seo-issues) - timestamp - ISO 8601 date/time formatted with dashes (e.g.,
2026-02-22-14-30-00)
System Requirements
Section titled “System Requirements”- macOS - Apple Silicon (M1/M2/M3/M4) or Intel
- The DMG is a universal binary that runs natively on both architectures