An AI agent that explores a web application through a real browser and generates structured user documentation in Markdown, including screenshots.
Built for the conference talk "Eigene AI Agents bauen" (Head in the Cloud 2026) by Robert Dey (Dey AI Solutions). The agent uses Deep Agents JS for planning, filesystem tools, and context management, plus custom Playwright tools for browser control.
- Autonomous exploration: navigates, clicks, types, scrolls, reads the accessibility tree
- Structured output: multi-file Markdown under /docs/, working notes in /findings/, screenshots in /screenshots/
- Planning: uses Deep Agents' built-in write_todos for progress tracking
- Context management: filesystem offloading and auto-summarization via the framework
- Resume: SQLite checkpoints let you continue after timeout or interruption
- Observability: formatted console logging via AgentLogger
- Node.js ≥ 24 (uses native --env-file)
- pnpm
- API key for Anthropic or OpenAI (including OpenAI-compatible endpoints)
pnpm install
# If better-sqlite3 native build was skipped, run: pnpm rebuild better-sqlite3
cp .env.example .env
# Edit .env with your TARGET_URL, LLM_API_KEY, etc.
pnpm build
pnpm start

Documentation is written to OUTPUT_DIR/<timestamp>/docs/.
To resume a run:
RESUME_SESSION_ID=<timestamp>

Set LLM_PROVIDER=openai and point LLM_BASE_URL at any OpenAI-compatible API. Example:
LLM_PROVIDER=openai
LLM_BASE_URL=https://api.cortecs.ai/v1
LLM_MODEL=deepseek-v4-pro
LLM_API_KEY=your-api-key

| Variable | Description | Default |
|---|---|---|
| TARGET_URL | Web app to document | (required) |
| OUTPUT_DIR | Base output directory | ./output |
| LLM_PROVIDER | anthropic or openai | anthropic |
| LLM_MODEL | Model identifier | (required) |
| LLM_API_KEY | Provider API key | (required) |
| LLM_BASE_URL | Base URL for OpenAI-compatible APIs (with openai) | (none) |
| CREDENTIALS_NOTE | Free-text login hints for the agent | (none) |
| HEADLESS | Run browser headless | false |
| BROWSER_VIEWPORT_WIDTH | Viewport width | 1280 |
| BROWSER_VIEWPORT_HEIGHT | Viewport height | 720 |
| MAX_RUNTIME_SECONDS | Max run duration | 300 |
| DOC_LANGUAGE | Documentation language | de |
| RESUME_SESSION_ID | Resume a previous run folder | (none) |
| MAX_NAVIGATION_DEPTH | Link depth limit (0 = unlimited) | (none) |
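
To make the defaults in the table concrete, here is a minimal sketch of reading a subset of these variables in plain TypeScript. It is not the project's actual config code (which is Zod-validated in src/config.ts); the names `AgentConfig`, `loadConfig`, `required`, and `intOr` are hypothetical, and treating only the literal string `true` as enabling HEADLESS is an assumption.

```typescript
// Hypothetical sketch: the real project validates its environment with Zod.
type Provider = "anthropic" | "openai";
type Env = Record<string, string | undefined>;

interface AgentConfig {
  targetUrl: string;
  outputDir: string;
  provider: Provider;
  model: string;
  apiKey: string;
  baseUrl: string | undefined; // only meaningful with provider "openai"
  headless: boolean;
  viewport: { width: number; height: number };
  maxRuntimeSeconds: number;
  docLanguage: string;
}

// Return a required variable or fail with a readable message.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable ${key}`);
  }
  return value;
}

// Parse a non-negative integer, falling back to the documented default.
function intOr(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed) || parsed < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

function parseProvider(raw: string): Provider {
  if (raw === "anthropic" || raw === "openai") return raw;
  throw new Error(`LLM_PROVIDER must be "anthropic" or "openai", got "${raw}"`);
}

function loadConfig(env: Env): AgentConfig {
  const provider = parseProvider(env["LLM_PROVIDER"] ?? "anthropic");
  return {
    targetUrl: required(env, "TARGET_URL"),
    outputDir: env["OUTPUT_DIR"] ?? "./output",
    provider,
    model: required(env, "LLM_MODEL"),
    apiKey: required(env, "LLM_API_KEY"),
    baseUrl: provider === "openai" ? env["LLM_BASE_URL"] : undefined,
    headless: env["HEADLESS"] === "true", // assumption: anything else means false
    viewport: {
      width: intOr(env, "BROWSER_VIEWPORT_WIDTH", 1280),
      height: intOr(env, "BROWSER_VIEWPORT_HEIGHT", 720),
    },
    maxRuntimeSeconds: intOr(env, "MAX_RUNTIME_SECONDS", 300),
    docLanguage: env["DOC_LANGUAGE"] ?? "de",
  };
}
```

Called as `loadConfig(process.env)`, a loader like this fails fast on a missing TARGET_URL, LLM_MODEL, or LLM_API_KEY instead of starting a browser session that cannot complete.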
src/
  cli.ts      # Entry point
  config.ts   # Zod-validated environment config
  agent/      # Deep Agent setup and run loop
  browser/    # Playwright service and browser tools
  logging/    # Console progress logging
  run/        # Run directories and SQLite checkpointer
output/
└── 2026-05-26_14-30-00/
    ├── docs/
    │   ├── index.md
    │   └── 01-feature.md
    ├── screenshots/
    ├── findings/
    └── .checkpoints/
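
The run layout above, together with RESUME_SESSION_ID, can be sketched as a small path helper. This is an illustration only: `formatSessionId` and `resolveRunPaths` are hypothetical names, and the use of local time for the timestamp is an assumption inferred from the folder name `2026-05-26_14-30-00`.

```typescript
// Hypothetical helper: derives the per-run folder layout shown above.
interface RunPaths {
  root: string;
  docs: string;
  screenshots: string;
  findings: string;
  checkpoints: string;
}

// Format a date as YYYY-MM-DD_HH-mm-ss (assumption: local time).
function formatSessionId(date: Date): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}-${pad(date.getMinutes())}-${pad(date.getSeconds())}`;
  return `${day}_${time}`;
}

// Reuse an existing run folder when resuming, otherwise start a new one.
function resolveRunPaths(
  outputDir: string,
  resumeSessionId: string | undefined,
  now: Date = new Date(),
): RunPaths {
  const sessionId = resumeSessionId ?? formatSessionId(now);
  const base = outputDir.replace(/\/+$/, ""); // drop trailing slashes
  const root = `${base}/${sessionId}`;
  return {
    root,
    docs: `${root}/docs`,
    screenshots: `${root}/screenshots`,
    findings: `${root}/findings`,
    checkpoints: `${root}/.checkpoints`,
  };
}
```

Because a resumed run resolves to the same folder, the checkpoints and the partially written docs stay together, which is what makes continuing after a timeout possible.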
pnpm lint
pnpm format
pnpm build

The agent loop is provided by Deep Agents JS. Custom code adds:
- Browser tools: navigate, click, type_text, scroll, screenshot, get_accessibility_tree, get_page_info
- System prompt: filesystem conventions, exploration strategy, credentials
- FilesystemBackend: agent writes docs directly to the run directory
- SqliteSaver: persistent checkpoints for resume
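
As a rough picture of how named browser tools might be dispatched, the sketch below wires two of the tool names from the list to handlers. It does not use the Deep Agents JS or Playwright APIs: `PageLike`, `ToolHandler`, `browserTools`, `runTool`, and `FakePage` are hypothetical, `PageLike` is only loosely modeled on a few Playwright page methods, and returning errors as strings rather than throwing is an assumption about how the agent receives failures.

```typescript
// Hypothetical shapes for illustration; not the project's actual tool code.
interface PageLike {
  goto(url: string): Promise<unknown>;
  url(): string;
  title(): Promise<string>;
}

type ToolArgs = Record<string, string | undefined>;
type ToolHandler = (page: PageLike, args: ToolArgs) => Promise<string>;

// Map tool names (as the model would call them) to handlers.
const browserTools: Record<string, ToolHandler | undefined> = {
  navigate: async (page, args) => {
    const url = args["url"];
    if (!url) return "Error: missing url";
    await page.goto(url);
    return `Navigated to ${page.url()}`;
  },
  get_page_info: async (page) =>
    JSON.stringify({ url: page.url(), title: await page.title() }),
};

// Run a tool by name and always answer with a string the model can read.
async function runTool(page: PageLike, name: string, args: ToolArgs): Promise<string> {
  const handler = browserTools[name];
  if (!handler) return `Error: unknown tool "${name}"`;
  try {
    return await handler(page, args);
  } catch (err) {
    return `Error: ${err instanceof Error ? err.message : String(err)}`;
  }
}

// In-memory stand-in so the sketch runs without a browser.
class FakePage implements PageLike {
  private current = "about:blank";
  async goto(url: string): Promise<unknown> {
    this.current = url;
    return undefined;
  }
  url(): string {
    return this.current;
  }
  async title(): Promise<string> {
    return `Title of ${this.current}`;
  }
}
```

Keeping the handlers behind a small interface like `PageLike` means the dispatch logic can be exercised with `FakePage`, while a real browser page is supplied at runtime.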
See spec.md and prd.md for the full specification.
Consulting for applied AI. I help teams find the right use cases and ship autonomous agents that hold up in production, not just in the demo:
- Use-case analysis — what's worth building, and what isn't
- Autonomous agents — production-grade, with guardrails & observability
- Robust engineering — maintainable systems your team can run without us
- Agentic coding — getting your codebase and team ready for coding agents
Got an idea for an agent of your own? → deyai.solutions · hello@deyai.solutions