How to Bootstrap a Documentation QA Bot with MCP
Have an agent crawl your docs site once, persist the Markdown, then answer support questions against it.
Support teams spend hours answering the same questions. If your documentation already covers the answers, an agent can do the lookup for you. With crawler-mcp, you can crawl your docs site once, store the Markdown, and let the agent answer questions by searching the cached content.
This guide shows how to build a documentation QA bot using crawl_site.
Step 1: Install crawler-mcp
Run the install script:

    curl -fsSL https://install.crawler.sh/install-mcp.sh | sh

This downloads the correct binary for your platform to ~/.crawler/bin/crawler-mcp.
For more detail, see the installation guide.
Step 2: Connect to a remote model via API
A chatbot needs a persistent backend, not a local IDE plugin. Use the MCP Python SDK or any HTTP client to bridge crawler-mcp with a remote LLM API.
Here is a minimal Python service that starts the MCP server, exposes the three tools, and forwards tool results to an LLM:

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client
    from openai import AsyncOpenAI

    # Start crawler-mcp
    server = StdioServerParameters(
        command="/Users/you/.crawler/bin/crawler-mcp",
        env={"CRAWLER_TOKEN": "your-token"},
    )

    # Async client; reads OPENAI_API_KEY from the environment
    llm = AsyncOpenAI()


    async def ask_llm(messages, tools):
        # Replace with your provider: OpenAI, Anthropic, Gemini, etc.
        response = await llm.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
        )
        return response


    async def main():
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                tools = await session.list_tools()
                # Pass tools to your LLM and let it decide which to call


    if __name__ == "__main__":
        asyncio.run(main())

The service runs crawler-mcp as a subprocess, reads its JSON-RPC stream, and translates tool calls between the LLM and the crawler. You can deploy this as a FastAPI endpoint, a Slack bot, or a Discord bot.
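One detail the service has to handle is that list_tools() returns tool definitions in MCP's shape (name, description, inputSchema), while chat-completion APIs expect their own. Below is a minimal sketch of that translation for the OpenAI function-tool format. The helper name to_openai_tools and the example schema are illustrative assumptions, not part of crawler-mcp.

```python
from types import SimpleNamespace


def to_openai_tools(mcp_tools):
    """Convert MCP tool definitions into OpenAI function-tool dicts."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description or "",
                "parameters": tool.inputSchema,
            },
        }
        for tool in mcp_tools
    ]


# Stand-in for one entry of (await session.list_tools()).tools.
# The schema here is a guess for illustration, not crawler-mcp's real one.
example_tool = SimpleNamespace(
    name="crawl_site",
    description="Crawl a site and return Markdown",
    inputSchema={
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
)

openai_tools = to_openai_tools([example_tool])
```

In the service, you would likely pass (await session.list_tools()).tools to this helper and hand the result to the LLM call as its tools argument. Other providers use a different envelope, so the dict layout would need adjusting.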
For a no-code start, wire crawler-mcp into Claude Desktop first, prove the workflow in chat, then port the same prompts to your API service.
Step 3: Crawl your docs site
Ask the agent to crawl your documentation:
Use crawler-sh to crawl https://docs.example.com to depth 3 with max_pages 200. Save the Markdown for each page to ./docs-cache/ as individual files.
The agent calls crawl_site, receives Markdown for every page, and writes each one to disk.
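If you would rather persist the pages from your own service than rely on the agent's file tools, the write step might look like the sketch below. It assumes each crawled page arrives as a dict with "url" and "markdown" keys. That shape is a guess for illustration, since the exact crawl_site response format is not shown here, and the helper names are hypothetical.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def url_to_filename(url):
    """Map a page URL to a flat, filesystem-safe Markdown filename."""
    path = urlparse(url).path.strip("/") or "index"
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", path).strip("-") or "index"
    return f"{slug}.md"


def save_pages(pages, cache_dir="./docs-cache"):
    """Write each page's Markdown to cache_dir and return the written paths."""
    out = Path(cache_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page in pages:
        target = out / url_to_filename(page["url"])
        # Keep the source URL on the first line so answers can cite it later.
        content = f"<!-- source: {page['url']} -->\n{page['markdown']}"
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written
```

Flattening the URL path into a single filename keeps the cache easy to scan. Two URLs that differ only in punctuation could collide, so a larger site may want to mirror the directory structure instead.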
Step 4: Index the docs for search
Ask the agent to create a searchable index:
Read all files in ./docs-cache/, extract the title and first paragraph from each, and write an index.json with {url, title, summary} for every page.
The agent scans the cached files and builds a lightweight index.
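To make the indexing step concrete, here is a rough sketch of what it could look like if done in code rather than by prompt. It assumes each cached file may start with a `<!-- source: URL -->` comment recording where the page came from. That is a convention invented for this example, and the file name is used as a fallback when the comment is missing.

```python
import json
import re
from pathlib import Path

SOURCE_RE = re.compile(r"<!--\s*source:\s*(\S+)\s*-->")


def summarize(markdown, fallback_url):
    """Return {url, title, summary} for one Markdown document."""
    match = SOURCE_RE.search(markdown)
    url = match.group(1) if match else fallback_url
    title, paragraph = "", []
    for line in SOURCE_RE.sub("", markdown).splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            if paragraph:
                break  # a new heading ends the first paragraph
            if not title:
                title = stripped.lstrip("#").strip()
        elif stripped:
            paragraph.append(stripped)
        elif paragraph:
            break  # a blank line ends the first paragraph
    return {"url": url, "title": title, "summary": " ".join(paragraph)}


def build_index(cache_dir="./docs-cache", index_path="index.json"):
    """Summarize every .md file in cache_dir and write the list as JSON."""
    entries = [
        summarize(path.read_text(encoding="utf-8"), fallback_url=path.name)
        for path in sorted(Path(cache_dir).glob("*.md"))
    ]
    Path(index_path).write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

The parser treats the first heading as the title and the first run of non-blank, non-heading lines as the summary. Pages that open with a code block or a table would need extra handling.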
Step 5: Answer support questions
When a question comes in, ask the agent to search the cache:
Someone asked “How do I configure SSO?” Search the docs cache for relevant pages and give a step-by-step answer with links.
The agent reads the cached docs, finds the relevant sections, and returns a grounded answer with source URLs.
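If you want the service to narrow down candidate pages before handing them to the LLM, a simple keyword-overlap ranking over the index is one way to start. The sketch below is a deliberately naive stand-in for real retrieval (no stop-word filtering, stemming, or embeddings). It assumes index entries shaped like {url, title, summary}, and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_index(question, entries, top_k=3):
    """Rank index entries by how many question words they share."""
    query = tokenize(question)
    scored = []
    for entry in entries:
        words = tokenize(entry["title"] + " " + entry["summary"])
        score = len(query & words)
        if score:
            scored.append((score, entry))
    # Highest overlap first; the sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


entries = [
    {
        "url": "https://docs.example.com/sso",
        "title": "Configure SSO",
        "summary": "Set up single sign-on for your workspace.",
    },
    {
        "url": "https://docs.example.com/billing",
        "title": "Billing",
        "summary": "Invoices and plans.",
    },
]

hits = search_index("How do I configure SSO?", entries)
```

The URLs of the returned entries can then be included in the prompt so the answer cites its sources. For larger doc sets, a proper ranking function such as BM25 or an embedding search would likely give better results.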