billrichards · billrichards · Jun 13, 2026 · Jun 3, 2026 · Jun 5, 2026 · Jun 5, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -16,6 +16,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- **Multi-worker concurrency** — two layers, available identically across CLI, web, and the Python API:
+  - *Page-level*: `TestConfig.workers` (CLI `--workers`, web `workers` in the `POST /api/run` body) tests multiple pages of a single run in parallel, each worker driving its own browser/context. Defaults to `1` (sequential, unchanged behaviour); capped at 16. Authentication is performed once and replicated to every worker via Playwright `storage_state`.
+  - *Session-level*: `BatchRunner` (`from qa_agent import BatchRunner`) runs multiple independent sessions through a bounded thread pool. The CLI exposes it via `--batch-file`/`--pool-size`; the web server now uses it instead of an unbounded thread-per-job model (`QA_AGENT_JOB_POOL_SIZE`, default 4).
+- **Expanded public API** — `from qa_agent import QAAgent, TestConfig, BatchRunner, …` now re-exports the full public surface for library use.
+
 ## [0.2.3] - 2026-05-22
 
 ### Fixed

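The session-level entry above replaces an unbounded thread-per-job model with a bounded pool. The pattern can be sketched with the standard library alone. This is a hedged illustration, not `BatchRunner`'s actual implementation: `run_bounded`, its parameters, and the choice to yield results in submission order are all assumptions made for the example. The one behaviour taken from the PR is that a failed job surfaces as an `Exception` value rather than aborting the batch, as the README example in this PR suggests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar, Union

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(
    jobs: Iterable[T],
    worker: Callable[[T], R],
    pool_size: int = 4,
) -> Iterator[Union[R, Exception]]:
    """Run worker(job) for every job using at most pool_size threads.

    Hypothetical helper: yields one item per job, in submission order.
    A job that raises yields its exception instead of propagating it.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be >= 1")
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Submitting everything up front is fine: the executor queues
        # the jobs and never runs more than pool_size at once.
        futures = [pool.submit(worker, job) for job in jobs]
        for future in futures:
            try:
                yield future.result()
            except Exception as exc:  # hand the failure back as a value
                yield exc
```

With a thread-per-job model, a burst of N requests starts N browsers at once; with the bounded pool, the extra jobs wait in the executor's queue until a slot frees up.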
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,111 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Setup
+
+```bash
+pip install -e ".[dev,web,pdf]"
+playwright install chromium
+```
+
+The package must be installed (editable is fine) for `python -m qa_agent`, the
+`qa-agent`/`qa-agent-web` entry points, and subprocess-based tests (e.g.
+`tests/_cli_exit_helper.py`) to find the `qa_agent` module. If packaging tests
+fail with `ModuleNotFoundError: No module named 'qa_agent'` or version
+mismatches, run `pip install -e .` first — check with `pip show qa-agent`.
+
+## Commands
+
+```bash
+# Unit tests (fast, no browser)
+pytest -v -m "not integration and not network"
+
+# Single test
+pytest tests/test_agent.py::TestClassName::test_name -v
+
+# Integration tests (real Playwright against local fixture server)
+pytest -v -m integration --no-cov
+
+# Lint / format / type-check
+ruff check .
+ruff format .
+mypy qa_agent
+
+# Build
+rm -rf build/ dist/ && python -m build
+```
+
+Coverage is enforced at 70% via `--cov-fail-under=70` in `pyproject.toml`
+(applies to default `pytest` invocations). Running a small subset of tests
+without `--no-cov` will fail the coverage gate even if the tests themselves
+pass. To check a few tests in isolation, pass `--no-cov`, or reset the
+configured options with `-p no:cacheprovider -o addopts=""`.
+
+Integration tests serve fixtures from `tests/fixtures/test-target/`, a
+73-page HTML fixture site driven by `manifest.json`. The manifest is the
+source of truth for parametrized integration tests: each entry maps a fixture
+file to an expected finding title/category. To debug, start the fixture
+server manually:
+
+```bash
+cd tests/fixtures/test-target && python3 -m http.server 8181
+```
+
+## Architecture
+
+Request flow: `cli.py` parses args into a `TestConfig` (`config.py`) →
+if `--instructions`/`--instructions-file` is set, `ai_planner.py` calls
+`llm_client.py` (Anthropic/OpenAI via stdlib `urllib`, no SDK deps) to
+produce a `TestPlan`, cached on disk by `plan_cache.py` (24h TTL) →
+`agent.py` (`QAAgent`) launches Playwright, iterates/crawls target URLs, and
+runs each enabled tester from `testers/` against every page, collecting
+`Finding` objects → reporters in `reporters/` consume the resulting
+`TestSession` and write console/markdown/json/pdf output.
+
+- **Concurrency**: `concurrency.py` implements page-level worker pools
+  (`--workers`, max 16) within a single run, and `batch.py` (`BatchRunner`)
+  runs multiple independent `TestConfig` sessions concurrently with a bounded
+  pool (`--pool-size`/`--batch-file`, max 8). Total live browsers ≈
+  `pool_size × workers`.
+- **Rate limiting**: `rate_limiter.py` (`HostRateLimiter`) paces
+  `page.goto()` navigations per-hostname (`--rate-limit`, default 3 req/s,
+  `0` disables). One shared instance per `QAAgent` run covers all its
+  workers; `BatchRunner` can hold a single shared instance passed to every
+  `QAAgent` it constructs so concurrent batch jobs hitting the same host
+  share one budget.
+- **Testers** (`testers/`) all extend `BaseTester` (`testers/base.py`),
+  receive a Playwright `Page` + `TestConfig`, and return `list[Finding]`.
+  `custom.py` runs AI-generated steps from the cached `TestPlan`.
+  `wcag_compliance.py` is opt-in (`--wcag-compliance`) and excluded from
+  coverage.
+- **Reporters** (`reporters/`) all extend `BaseReporter` and consume a
+  `TestSession`; JSON is always written regardless of `--output` (web UI
+  relies on it for session discovery).
+- **Web UI** (`web/`): Flask app (`server.py`) with SSE streaming for live
+  run output; templates/static assets are in `web/templates/` and
+  `web/static/`. No auth — local/internal use only.
+- **Models** (`models.py`): `Finding`, `FindingCategory`, `Severity`,
+  `PageAnalysis`, `TestSession`, `TestPlan` — the shared data contracts
+  between testers, the agent, and reporters.
+
+### Adding a new tester
+
+1. Create a module in `testers/` that extends `BaseTester` and implements
+   `run() -> list[Finding]`.
+2. Export from `testers/__init__.py`.
+3. Add a `test_*` bool to `TestConfig` (`config.py`).
+4. Wire into `agent.py` `_test_page()`.
+5. Add `--skip-*`/opt-in flag in `cli.py` if needed.
+6. Add tests in `tests/testers/`.
+
+### Severity levels
+
+`CRITICAL` (security/data loss) · `HIGH` (major usability blockers) ·
+`MEDIUM` (UX/accessibility) · `LOW` (minor/best-practice) · `INFO`.
+
+### Exit codes (CLI)
+
+`0` no critical/high findings · `1` critical/high findings found · `2` error
+during run · `130` interrupted (Ctrl+C). Covered by
+`tests/test_packaging.py::TestExitCodeSmoke` via `tests/_cli_exit_helper.py`.
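
The rate-limiting bullet in the architecture notes above describes per-hostname pacing of `page.goto()` navigations, with one instance shared across workers. A minimal sketch of that idea follows. `PerHostPacer` and its methods are hypothetical names chosen for illustration and are not the project's `HostRateLimiter`; the evenly spaced slot scheme and the injectable `clock`/`sleep` hooks are assumptions. The only details taken from the PR are that the limit is per host and that a rate of `0` disables it.

```python
import threading
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class PerHostPacer:
    """Hypothetical per-host pacer; NOT the project's HostRateLimiter."""

    def __init__(
        self,
        rate_per_sec: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        # A rate of 0 disables pacing, mirroring `--rate-limit 0`.
        self._interval = 1.0 / rate_per_sec if rate_per_sec > 0 else 0.0
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot: Dict[str, float] = {}  # host -> earliest next start

    def reserve(self, url: str) -> float:
        """Claim the next slot for url's host; return seconds to wait."""
        if self._interval == 0.0:
            return 0.0
        host = urlsplit(url).hostname or ""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot.get(host, now))
            self._next_slot[host] = slot + self._interval
        return slot - now

    def wait(self, url: str) -> None:
        """Block until this caller's slot for url's host arrives."""
        delay = self.reserve(url)
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so other hosts proceed
```

A worker would call `pacer.wait(url)` immediately before navigating. Because slots are reserved under a lock but slept on outside it, one instance can be shared by every worker thread, and traffic to one host does not delay navigations to another.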
diff --git a/README.md b/README.md
@@ -127,6 +127,25 @@ print(f"Pages tested:   {len(session.pages_tested)}")
 print(f"Total findings: {session.total_findings}")
 ```
 
+Set `workers` to test pages in parallel, and use `BatchRunner` to run several
+independent sessions concurrently with a bounded pool:
+
+```python
+from qa_agent import BatchRunner, TestConfig
+
+configs = [
+    TestConfig(urls=["https://example.com"], workers=4),
+    TestConfig(urls=["https://other.test"]),
+]
+
+with BatchRunner(pool_size=4) as runner:
+    for result in runner.run_all(configs):
+        if isinstance(result, Exception):
+            print(f"session failed: {result}")
+        else:
+            print(f"{result.session_id}: {result.total_findings} findings")
+```
+
 → [Full Python API Reference](https://github.com/billrichards/qa-agent/blob/main/docs/api-reference.md) — all classes, methods, and configuration options.
 
 ---
@@ -241,6 +260,50 @@ qa-agent --mode focused https://example.com   # default — test only given URLs
 qa-agent --mode explore https://example.com    # crawl and test discovered pages
 ```
 
+### Concurrency
+
+Test multiple pages in parallel with cooperating workers. Each worker drives its
+own browser, so memory and CPU scale with the worker count (capped at 16).
+
+```bash
+qa-agent --workers 4 --mode explore https://example.com   # 4 pages at a time
+```
+
+Run several independent sessions concurrently from a JSON spec file. Each entry
+needs `urls` plus optional per-run overrides (`mode`, `max_depth`, `max_pages`,
+`instructions`, `workers`); all other settings come from the command-line flags.
+
+```bash
+qa-agent --batch-file runs.json --pool-size 4
+```
+
+```json
+[
+  {"urls": ["https://example.com"], "mode": "explore", "workers": 4},
+  {"urls": ["https://other.test/login"], "instructions": "Check the checkout flow"}
+]
+```
+
+| Flag | Default | Description |
+|---|---|---|
+| `--workers N` | `1` | Concurrent page-workers per run (max 16) |
+| `--batch-file FILE` | — | JSON file of multiple runs to execute concurrently |
+| `--pool-size N` | `4` | Max concurrent runs for `--batch-file` (max 8) |
+| `--rate-limit N` | `3.0` | Max page navigations/sec to any single host (0 = unlimited) |
+
+> Total live browsers ≈ `pool-size × workers`, so size both with that
+> multiplicative cost in mind. The web API accepts the same `workers` value in
+> the `POST /api/run` body, and the pool size is set server-side via the
+> `QA_AGENT_JOB_POOL_SIZE` environment variable.
+
+By default, navigations to any single host are throttled to 3 requests/second
+across all workers and batch jobs, so that many concurrent browsers do not
+push dev/staging servers into "too many connections" errors. Raise or disable
+the limit with `--rate-limit` (e.g. `--rate-limit 10`, or `--rate-limit 0`
+for unlimited). The limit applies only to page navigations (`page.goto()`), not
+to in-page interactions like clicks or form fills. The web server uses the same
+3 req/s default, overridable via the `QA_AGENT_RATE_LIMIT` environment variable.
+
 ### Exploration (explore mode)
 
 | Flag | Default | Description |

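The README note that total live browsers ≈ `pool-size × workers` is easy to underestimate, so here is the arithmetic as a small sketch. `estimate_live_browsers` is a hypothetical helper, not part of `qa_agent`. The caps of 16 and 8 and the defaults of `1` and `4` come from the flag table in this PR; clamping out-of-range values, rather than rejecting them, is an assumption.

```python
MAX_WORKERS = 16   # per-run cap from the `--workers` row of the flag table
MAX_POOL_SIZE = 8  # cap from the `--pool-size` row of the flag table


def estimate_live_browsers(pool_size: int = 4, workers: int = 1) -> int:
    """Upper bound on concurrently open browsers for a batch run.

    Assumes out-of-range values are clamped to [1, cap]; the real CLI
    may instead reject them.
    """
    pool = max(1, min(pool_size, MAX_POOL_SIZE))
    per_run = max(1, min(workers, MAX_WORKERS))
    return pool * per_run


if __name__ == "__main__":
    # The README's batch example: pool of 4, one run asking for 4 workers.
    print(estimate_live_browsers(pool_size=4, workers=4))
```

At both caps the bound is 8 × 16 = 128 browsers, which is why the two values are best sized together rather than independently.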
diff --git a/qa_agent/__init__.py b/qa_agent/__init__.py
@@ -12,3 +12,34 @@
 except PackageNotFoundError:
     # Package not installed (e.g. running from source without install)
     __version__ = "0.2.3"
+
+from .agent import QAAgent  # noqa: E402
+from .batch import BatchJob, BatchRunner  # noqa: E402
+from .config import (  # noqa: E402
+    AuthConfig,
+    OutputFormat,
+    RecordingConfig,
+    ScreenshotConfig,
+    TestConfig,
+    TestMode,
+)
+from .llm_client import LLMProvider  # noqa: E402
+from .models import Finding, PageAnalysis, Severity, TestSession  # noqa: E402
+
+__all__ = [
+    "QAAgent",
+    "BatchRunner",
+    "BatchJob",
+    "TestConfig",
+    "AuthConfig",
+    "ScreenshotConfig",
+    "RecordingConfig",
+    "TestMode",
+    "OutputFormat",
+    "LLMProvider",
+    "TestSession",
+    "PageAnalysis",
+    "Finding",
+    "Severity",
+    "__version__",
+]