# Desktop Reference

The Settings card provides these parameters:

| Parameter | Range | Default | Description |
|---|---|---|---|
| Max Pages | 1 – 1,000 | 20 | Maximum number of pages to crawl |
| Max Depth | 1 – 50 | 10 | Maximum crawl depth from the start URL |
| Concurrency | 1 – 20 | 5 | Number of concurrent HTTP requests |
| Delay (ms) | 0 – 5,000 | 50 | Delay between requests in milliseconds |
| Extract Content | on / off | on | Enable content extraction to Markdown |

The desktop app also includes a Dark Mode toggle in the Settings card, persisted across sessions.
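The parameters and ranges above can be sketched as a small settings object. This is an illustrative sketch only; the `CrawlSettings` class and `clamped` helper are hypothetical names, while the field names, defaults, and ranges come from the table.

```python
from dataclasses import dataclass


def clamp(value: int, lo: int, hi: int) -> int:
    """Pin a value into an inclusive range."""
    return max(lo, min(hi, value))


@dataclass
class CrawlSettings:
    # Defaults as listed in the Settings table.
    max_pages: int = 20
    max_depth: int = 10
    concurrency: int = 5
    delay_ms: int = 50
    extract_content: bool = True

    def clamped(self) -> "CrawlSettings":
        """Return a copy with every numeric field pinned to its documented range."""
        return CrawlSettings(
            max_pages=clamp(self.max_pages, 1, 1_000),
            max_depth=clamp(self.max_depth, 1, 50),
            concurrency=clamp(self.concurrency, 1, 20),
            delay_ms=clamp(self.delay_ms, 0, 5_000),
            extract_content=self.extract_content,
        )


settings = CrawlSettings(max_pages=5_000).clamped()
```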

Each crawled page contains the following fields, serialized to JSON in the output formats.

| Field | Type | Description |
|---|---|---|
| `url` | String | Full URL of the page |
| `status_code` | Number | HTTP status code |
| `content_type` | String? | Content-Type header value |
| `title` | String? | HTML `<title>` text |
| `meta_description` | String? | `<meta name="description">` content |
| `canonical_url` | String? | `<link rel="canonical">` href |
| `discovered_from` | String? | Parent URL that linked to this page |
| `links_found` | Number | Number of new same-domain links discovered |
| `depth` | Number | Crawl depth from the start URL |
| `response_time_ms` | Number | HTTP response time in milliseconds |
| `markdown` | String? | Extracted content as Markdown (when enabled) |
| `word_count` | Number? | Word count of extracted content |
| `byline` | String? | Author byline from content extraction |
| `excerpt` | String? | Article excerpt from content extraction |
| `meta_robots` | String? | `<meta name="robots">` content |
| `x_robots_tag` | String? | X-Robots-Tag HTTP header value |
| `rel_next` | String? | `<link rel="next">` href (pagination) |
| `rel_prev` | String? | `<link rel="prev">` href (pagination) |

Fields marked with ? are optional and may be absent depending on the page.
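In practice that means a consumer should not assume optional fields exist. A minimal sketch, assuming the record arrives as a plain JSON object (the sample values here are invented):

```python
# A minimal crawled-page record: only the non-optional fields are guaranteed.
page = {
    "url": "https://example.com/about",
    "status_code": 200,
    "links_found": 12,
    "depth": 1,
    "response_time_ms": 87,
}

# Optional ("?") fields may be absent, so read them defensively with .get().
title = page.get("title")               # None when the field is missing
word_count = page.get("word_count", 0)  # e.g. when content extraction is off
```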

The crawler emits real-time events during a crawl session, used to display live progress in the dashboard.

| Event | Data | Description |
|---|---|---|
| `Started` | `url` | Crawl has begun |
| `Discovered` | `url`, `depth` | New URL found and enqueued |
| `PageCrawled` | Page object | Page successfully fetched and parsed |
| `PageError` | `url`, `error` | Failed to fetch a page |
| `Progress` | `crawled`, `total_discovered` | Periodic progress update |
| `Completed` | `total_pages`, `total_errors` | Crawl finished |
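A dashboard consumer dispatches on the event name and reads the listed data fields. The handler below is a hypothetical sketch, not the app's actual code; only the event names and payload keys are taken from the table.

```python
def handle_event(name: str, data: dict) -> str:
    """Format one crawl event as a status line for a progress display."""
    if name == "Progress":
        return f"{data['crawled']}/{data['total_discovered']} pages"
    if name == "PageError":
        return f"error fetching {data['url']}: {data['error']}"
    if name == "Completed":
        return f"done: {data['total_pages']} pages, {data['total_errors']} errors"
    # Started / Discovered / PageCrawled: just echo the event name here.
    return name
```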

Complete crawl data as an array of page objects:

```json
[
  {
    "url": "https://example.com",
    "title": "Example",
    "meta_description": "An example site",
    "canonical_url": "https://example.com",
    "discovered_from": null,
    "status_code": 200,
    "markdown": "# Example\n\nContent here...",
    "word_count": 150,
    "byline": "Author Name",
    "excerpt": "A brief summary..."
  }
]
```

Content fields (markdown, word_count, byline, excerpt) are only included when present.
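Because the export is a plain JSON array, any JSON library can consume it. An illustrative sketch (the trimmed sample data and variable names are invented):

```python
import json

# A trimmed stand-in for the exported array of page objects.
export_text = (
    '[{"url": "https://example.com", "title": "Example", "status_code": 200},'
    ' {"url": "https://example.com/a", "status_code": 404}]'
)

pages = json.loads(export_text)

# Pages that returned a 2xx status.
ok = [p["url"] for p in pages if 200 <= p["status_code"] < 300]

# Pages lacking the optional title field.
missing_title = [p["url"] for p in pages if not p.get("title")]
```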

Standard urlset document. Only includes pages with 2xx status codes. Special characters (&, <, >) are escaped in <loc> elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
</urlset>
```
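The escaping of `&`, `<`, and `>` in `<loc>` values can be reproduced with a standard XML escape helper. A sketch, not the app's implementation; the example URL is invented:

```python
from xml.sax.saxutils import escape

# A URL containing "&" must be escaped before it is placed in <loc>.
loc = escape("https://example.com/search?q=a&b")
entry = f"<url><loc>{loc}</loc></url>"
```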

Full page content as clean Markdown files bundled in a .zip archive.

Two columns: Issue Type and URL. One row per affected URL, with every value enclosed in double quotes.

```csv
"Issue Type","URL"
"Missing titles","https://example.com/page-1"
"Short content","https://example.com/page-2"
```
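The quoted format is standard CSV, so the file can be grouped back by issue type with an ordinary CSV reader. An illustrative sketch using the sample rows above:

```python
import csv
import io
from collections import defaultdict

report = (
    '"Issue Type","URL"\n'
    '"Missing titles","https://example.com/page-1"\n'
    '"Short content","https://example.com/page-2"\n'
)

reader = csv.reader(io.StringIO(report))
next(reader)  # skip the header row

# Collect affected URLs under each issue type.
by_issue = defaultdict(list)
for issue, url in reader:
    by_issue[issue].append(url)
```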

Human-readable report grouped by issue category with indented URLs.

```text
Missing titles (2)
  https://example.com/page-1
  https://example.com/page-2
Short content (1)
  https://example.com/page-3
```

Desktop exports follow the pattern:

```text
{domain}-{label}-{timestamp}.{ext}
```

- `domain`: hostname with dots replaced by dashes (e.g., `example-com`)
- `label`: export type (`crawl-results`, `sitemap`, `seo-issues`)
- `timestamp`: ISO 8601 date/time formatted with dashes (e.g., `2026-02-22-14-30-00`)

On macOS, the app supports both Apple Silicon (M1/M2/M3/M4) and Intel; the DMG is a universal binary that runs natively on both architectures.
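The naming rules above can be assembled in a few lines. A hypothetical helper, shown only to make the pattern concrete; the function name is not part of the app:

```python
from urllib.parse import urlparse


def export_filename(start_url: str, label: str, timestamp: str, ext: str) -> str:
    """Build the {domain}-{label}-{timestamp}.{ext} export name."""
    # Hostname with dots replaced by dashes, per the pattern description.
    domain = urlparse(start_url).hostname.replace(".", "-")
    return f"{domain}-{label}-{timestamp}.{ext}"


name = export_filename("https://example.com", "sitemap", "2026-02-22-14-30-00", "xml")
```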