- Purpose-built documents – Produces executive-ready integration playbooks, migration guides, or architecture documents that stay coherent for 60+ pages.
- Stakeholder friendly – Interactive intake questions capture goals, constraints, and tone before planning; multiple review cycles give teams a human-in-the-loop checkpoint.
- Diagram-rich storytelling – PlantUML/Mermaid diagrams are authored automatically, rendered to PNG/SVG, and embedded in Markdown, PDF, and DOCX outputs.
- Transparent progress – Every stage (plan, write, review, verify, rewrite, diagram, finalize) emits timeline events so product owners always know what’s happening.
- Cloud native – Runs entirely on Azure with Service Bus queues, Blob Storage, Application Insights telemetry, and container-ready packaging for easy operations.
- Polished final artifacts – Final Markdown/PDF/DOCX begin with an autogenerated title page, include auto-numbered headings, and preserve in-document hyperlinks.
- Automatic diagram handling – Diagram prep/render stages publish queued/start/done/failed status updates while converting PlantUML/Mermaid blocks into PNG/SVG assets sized appropriately for both PDF and DOCX.
- PlantUML-savvy authoring – The writer agent references an internal catalog of supported PlantUML diagram families with example syntax to avoid unsupported constructs.
- Token transparency across review cycles – The UI aggregates tokens for all REVIEW/VERIFY/REWRITE iterations so you see total review consumption instead of just the latest pass.
- Responsive PDF exports – WeasyPrint rendering now applies responsive CSS so diagrams scale within page margins; DOCX exports clamp diagram width for similar fidelity.
- Mermaid rendering fallback – Mermaid blocks are rendered through Kroki and cached under `jobs/<job_id>/images/diagram_*.png` for reliable embedding.
- Structured storage layout – Jobs store drafts, cycle reviews, diagram sources, and final artifacts in predictable Blob paths to simplify debugging and downstream automations.
- Generates long, consistent Markdown documents (>60 pages) with Mermaid and PlantUML diagrams rendered into PNG/SVG.
- Agentic pipeline: Planner (o3), Writer (gpt-5.2), Reviewer set (o3: general + style + cohesion + executive summary).
- Queue-driven architecture on Azure Service Bus; artifacts stored in Azure Blob Storage.
- Dedicated diagram-prep/render stages ensure graphics exist before finalize/PDF/DOCX export and surface progress in the timeline UI.
- REST-first workflow: jobs are created and monitored via FastAPI; Azure Container Apps Jobs host each worker stage.
- Interactive intake: collects detailed requirements before planning for higher quality output.
- Prerequisites
- Python 3.10+, Node.js 18+, Azure CLI.
- OpenAI/Azure OpenAI models: planner/reviewer/writer default to `gpt-5.2`/`gpt-4.1`; PlantUML reformat model defaults to `gpt-5`.
- WeasyPrint native deps (Cairo, Pango) for PDF export: https://weasyprint.readthedocs.io/en/stable/install/.
- Install dependencies
    python -m venv .venv && source .venv/bin/activate
    pip install -e .[dev]  # builds/installs docwriter library
    npm install --prefix ui
- Configure environment variables
    export OPENAI_API_KEY=...
    export OPENAI_BASE_URL=...
    export OPENAI_API_VERSION=...
    export SERVICE_BUS_NAMESPACE=<service-bus-namespace-name>
    export SERVICE_BUS_QUEUE_PLAN_INTAKE=docwriter-plan-intake
    export SERVICE_BUS_QUEUE_INTAKE_RESUME=docwriter-intake-resume
    export SERVICE_BUS_QUEUE_PLAN=docwriter-plan
    export SERVICE_BUS_QUEUE_WRITE=docwriter-write
    export SERVICE_BUS_QUEUE_REVIEW=docwriter-review  # legacy alias for general review
    export SERVICE_BUS_QUEUE_REVIEW_GENERAL=docwriter-review  # general reviewer
    export SERVICE_BUS_QUEUE_REVIEW_STYLE=docwriter-review-style  # style reviewer
    export SERVICE_BUS_QUEUE_REVIEW_COHESION=docwriter-review-cohesion
    export SERVICE_BUS_QUEUE_REVIEW_SUMMARY=docwriter-review-summary
    export SERVICE_BUS_QUEUE_VERIFY=docwriter-verify
    export SERVICE_BUS_QUEUE_REWRITE=docwriter-rewrite
    export SERVICE_BUS_QUEUE_DIAGRAM_PREP=docwriter-diagram-prep
    export SERVICE_BUS_QUEUE_DIAGRAM_RENDER=docwriter-diagram-render
    export SERVICE_BUS_QUEUE_FINALIZE_READY=docwriter-finalize-ready
    export SERVICE_BUS_TOPIC_STATUS=docwriter-status
    export SERVICE_BUS_STATUS_SUBSCRIPTION=status-writer
    export DOCWRITER_WRITE_BATCH_SIZE=5  # sections per write batch
    export DOCWRITER_REVIEW_BATCH_SIZE=3  # sections per review batch (per agent)
    export DOCWRITER_REVIEW_MAX_PROMPT_TOKENS=12000  # prompt cap to split review batches safely
    export AZURE_STORAGE_CONNECTION_STRING=...
    export AZURE_BLOB_CONTAINER=docwriter
    export PLANTUML_SERVER_URL=https://plantuml.example.com/plantuml
    export DOCWRITER_PLANTUML_REFORMAT_MODEL=gpt-5  # optional
    export APPINSIGHTS_INSTRUMENTATION_KEY=...  # optional
    export NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
- Run locally (API + worker job runner + UI)
    # API
    uvicorn api.main:app --reload
    # Worker job runner (one message per execution)
    DOCWRITER_WORKER_KIND=queue \
    DOCWRITER_WORKER_STAGE=plan \
    DOCWRITER_WORKER_QUEUE=docwriter-plan \
    python -m docwriter.job_runner
    # UI
    npm run dev --prefix ui

For lightweight testing, you can enqueue messages directly:

    python -m docwriter.queue --queue plan --message-file samples/plan.json
- Intake Questions (`PLAN_INTAKE`) – Interviewer agent drafts clarifying questions; responses land in `jobs/{job_id}/intake/`.
- Intake Resume (`INTAKE_RESUME`) – Answers hydrate the job context and allocate the working document blob.
- Plan (`PLAN`) – Planner builds the outline, glossary, style guardrails, and PlantUML specs (`plan.json`).
- Write (`WRITE`) – Writer creates sections in dependency order in configurable batches (`DOCWRITER_WRITE_BATCH_SIZE`), embeds diagram stubs, tracks summaries, and stores `draft.md`.
- Review (`REVIEW`) – Reviewer ensemble runs on split queues (general → style → cohesion → executive summary) and now batches multiple sections per call (`DOCWRITER_REVIEW_BATCH_SIZE`, prompt-capped by `DOCWRITER_REVIEW_MAX_PROMPT_TOKENS`) to reduce tokens and queue churn.
- Verify (`VERIFY`) – Applies reviewer revisions, flags contradictions, and notes placeholders to drive targeted rewrites.
- Rewrite (`REWRITE`) – Rewrites only affected sections using dependency context and combined guidance.
- Diagram Prep (`DIAGRAM`) – Extracts PlantUML/Mermaid blocks, uploads sanitized `.puml` sources, and queues render requests (emitting SKIPPED/QUEUED events).
- Diagram Render (`DIAGRAM`) – Calls the PlantUML server, stores PNG/SVG outputs, and signals START/DONE/FAILED counts to the status timeline.
- Finalize (`FINALIZE`) – Applies rendered diagrams, injects the title page, numbers headings, ensures internal links remain clickable, and exports Markdown/PDF/DOCX.
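The hand-offs between these stages, including the review loop bounded by the job's `cycles` value, can be sketched as a small transition function. This is an illustration only: the `Stage` enum, `next_stage`, and the `needs_rewrite` flag are hypothetical names, not the project's orchestrator, and the diagram stages are left out because they run on their own queues.

```python
from enum import Enum


class Stage(str, Enum):
    PLAN_INTAKE = "PLAN_INTAKE"
    INTAKE_RESUME = "INTAKE_RESUME"
    PLAN = "PLAN"
    WRITE = "WRITE"
    REVIEW = "REVIEW"
    VERIFY = "VERIFY"
    REWRITE = "REWRITE"
    FINALIZE = "FINALIZE"


# Straight-line hand-offs. VERIFY is handled separately because it branches.
_LINEAR = {
    Stage.PLAN_INTAKE: Stage.INTAKE_RESUME,
    Stage.INTAKE_RESUME: Stage.PLAN,
    Stage.PLAN: Stage.WRITE,
    Stage.WRITE: Stage.REVIEW,
    Stage.REVIEW: Stage.VERIFY,
    Stage.REWRITE: Stage.REVIEW,
}


def next_stage(current, *, needs_rewrite=False, cycles_done=0, max_cycles=2):
    """Return the stage after `current`, or None once FINALIZE has run."""
    if current is Stage.VERIFY:
        # Loop back only while the verifier asks for rewrites and the
        # requested number of review cycles has not been used up.
        if needs_rewrite and cycles_done < max_cycles:
            return Stage.REWRITE
        return Stage.FINALIZE
    return _LINEAR.get(current)
```

Keeping the branch in one place makes the "clean or cycles exhausted" exit easy to test without a queue.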
- `.github/workflows/docker-build.yml` builds all Function + API images and automatically invokes the Terraform workflow, passing the resolved Docker tag.
- `.github/workflows/terraform.yml` provisions Azure resources (Service Bus, Blob Storage, Container Apps, monitoring). Triggered manually or via `workflow_call` from the build pipeline.
- Start locally: `uvicorn api.main:app --reload`
- `POST /jobs` → Enqueue a document (`{"title": "...", "audience": "...", "cycles": 2}`)
- `POST /jobs/{job_id}/resume` → Upload intake answers (`{"answers": {...}}`) and advance the pipeline
- `GET /jobs/{job_id}/status` → Latest stage snapshot
- `GET /jobs/{job_id}/timeline` → Full chronological list of status events (including review cycles and durations)
- `GET /jobs/artifacts?path=jobs/<job_id>/final.md` → Download an artifact via the API (proxy for Blob Storage)
- `POST /intake/questions` → Generate a tailored intake questionnaire
- `GET /healthz` → Basic health check
- (Authentication and additional endpoints will be added alongside the UI.)
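A client for these endpoints could build its requests as in the sketch below, using only the standard library. The helper names are hypothetical, the optional `token` argument assumes a conventional `Authorization: Bearer` header, and response bodies are not parsed because their shape is not documented here.

```python
import json
import urllib.parse
import urllib.request


def build_create_job_request(base_url, title, audience, cycles=2, token=None):
    """Build (but do not send) a POST /jobs request with the documented body."""
    body = json.dumps({"title": title, "audience": audience, "cycles": cycles})
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API accepts a standard bearer token header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/jobs",
        data=body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def build_status_request(base_url, job_id, token=None):
    """Build (but do not send) a GET /jobs/{job_id}/status request."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    job = urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/jobs/{job}/status",
        headers=headers,
        method="GET",
    )
```

Passing either request to `urllib.request.urlopen` sends it; separating construction from sending keeps the helpers testable without a running API.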
- The `ui/` Next.js app shows job status, timeline, token usage, durations, and model names with glassmorphism styling.
- Timeline view displays every stage; review cycles can expand to show each pass (review, verify, rewrite) with metrics.
- Artifact actions use the `/jobs/artifacts` API so no direct Blob permissions are needed.
- Agents
- Planner (o3): Produces a structured document plan, outline, glossary, constraints, and diagram specs.
- Writer (gpt-4.1): Writes section-by-section with a shared memory (style + facts) to maintain consistency.
- Verifier (o3): Checks contradictions against dependency summaries and flags required rewrites.
- Rewriter (o3): Applies targeted rewrites using verifier + reviewer guidance.
- Reviewers (o3):
- General reviewer: contradictions and quality with revised draft.
- Style reviewer: clarity, tone, readability.
- Cohesion reviewer: flow, transitions, cross-references.
- Executive summary reviewer: summary quality.
- Interviewer (o3): Conducts an intake to gather goals, audience specifics, constraints, and preferences.
- Diagram formatter: Cleans PlantUML and ensures valid syntax before render.
- Finalizer: Applies rendered diagrams, numbers headings, preserves links, and emits PDF/DOCX.
- Stage workers
- PLAN_INTAKE (questions to Blob) → INTAKE_RESUME (user answers) → PLAN → WRITE (batched) → REVIEW (general → style → cohesion → executive summary, batched per stage) → VERIFY → REWRITE → back to REVIEW (loop by cycles) → VERIFY → FINALIZE
- Enforces dependency-aware order, configurable batching for write/review, and performs contradiction verification and targeted rewrites.
- Clients
- OpenAI client abstraction supports streaming and model selection per agent.
- Service Bus producer/worker and Status topic for horizontal scaling and monitoring.
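The configurable batching mentioned for the write and review stages, with a section-count limit and a prompt-token cap, might look like the following sketch. The function names and the four-characters-per-token estimate are assumptions for illustration; the actual workers may count tokens differently.

```python
def estimate_tokens(text):
    """Crude token estimate (assumption: roughly 4 characters per token)."""
    return max(1, len(text) // 4)


def batch_sections(sections, batch_size=3, max_prompt_tokens=12000):
    """Group (section_id, text) pairs into lists of section ids.

    A batch is closed when it already holds `batch_size` sections or when
    adding the next section would push it past `max_prompt_tokens`. A single
    oversized section still gets a batch of its own.
    """
    batches, current, current_tokens = [], [], 0
    for section_id, text in sections:
        tokens = estimate_tokens(text)
        full = len(current) >= batch_size
        too_big = current_tokens + tokens > max_prompt_tokens
        if current and (full or too_big):
            batches.append(current)
            current, current_tokens = [], 0
        current.append(section_id)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches
```

The defaults mirror the example values of `DOCWRITER_REVIEW_BATCH_SIZE` and `DOCWRITER_REVIEW_MAX_PROMPT_TOKENS` from the Quick Start.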
flowchart LR
    subgraph Intake
        A[docwriter generate<br/>enqueue job] --> B{{Queue: plan-intake}}
        B --> C([worker-plan-intake<br/>interviewer])
        C --> D{{Queue: intake-resume}}
        D --> E([worker-intake-resume])
    end
    subgraph Plan & Write
        E --> F{{Queue: plan}}
        F --> G([worker-plan<br/>PlannerAgent])
        G --> H{{Queue: write}}
        H --> I([worker-write<br/>WriterAgent])
    end
    subgraph Review Loop
        I --> J{{Queue: review}}
        J --> K([worker-review<br/>general/style/cohesion/exec])
        K --> L{{Queue: verify}}
        L --> M([worker-verify<br/>contradiction check])
        M -->|Needs rewrite| N{{Queue: rewrite}}
        N --> O([worker-rewrite])
        O --> J
        M -->|Clean or cycles exhausted| P{{Queue: finalize}}
        I -.-> X{{Queue: diagram-prep}}
        X --> Y([worker-diagram-prep<br/>sanitize PlantUML])
        Y --> Z{{Queue: diagram-render}}
        Z --> AA([worker-diagram-render<br/>PlantUML server])
        AA --> P
    end
    P --> Q([worker-finalize<br/>final doc to Blob])
    Q -.-> R((Status Topic))
    J -.-> R
    H -.-> R
    F -.-> R
- Container Apps Jobs per stage: Each queue/topic processor runs as an event-driven Azure Container Apps Job (KEDA scale rule), reusing existing `process_*` handlers.
- Container Apps deployment: Worker jobs and the REST API are packaged as container images and hosted on Azure Container Apps for horizontal scaling and simplified ops.
- Public API: A lightweight FastAPI service will mirror core CLI commands (enqueue job, resume intake, status polling) so future web clients can integrate without shell access. CLI remains fully supported.
- Terraform IaC: Infrastructure modules (`infra/terraform`) will provision Service Bus resources, Blob Storage, Container Apps, monitoring, and identity wiring. Sample Dockerfiles and deployment scripts will live in `infra/docker` and `scripts/`.
- `infra/terraform` now contains initial modules for the resource group, Service Bus namespace, storage account, monitoring, and Container Apps environment. Provide container image references via `terraform.tfvars` before running `terraform init && terraform apply`.
- `.github/workflows/docker-build.yml` builds/pushes all container images to an Azure Container Registry using OpenID Connect. Configure repository secrets `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID`, `ACR_NAME`, and `ACR_LOGIN_SERVER` before enabling the workflow.
- Shared configuration: All runtimes (CLI, worker jobs, API) read configuration exclusively from environment variables.
- Detailed in the Quick Start section. Timeline and artifact endpoints give full stage visibility without Blob access.
Target directory highlights
src/
docwriter/ # shared core logic (existing)
docwriter/job_runner.py # ACA Job entrypoint for queue/topic workers
api/ # REST interface built on the same orchestrators
infra/
terraform/ # Service Bus, Storage, Container Apps, monitoring
docker/ # container definitions for functions/API
config/
functions.settings.json # legacy local Functions config example
Documentation and deployment scripts will be updated as those components land.
- Phase 1 – Marketing & Routing Refresh
  - Split the Next.js app into public pages (`/`, `/features`) and an authenticated `/workspace` route.
  - Introduced the glassy global navigation, new hero sections, and updated CTAs for “Explore workspace” and “Create document.”
  - Rewrote the home/feature/workspace hero copy, ensuring benefits/capabilities live in unified gradient sections.
- Phase 2 – Auth0 Universal Login Integration
  - Replaced the ad-hoc Lock flow with the official `@auth0/nextjs-auth0` SDK, including `/api/auth/[auth0]`, middleware, and `/api/auth/token` helper.
  - Header auth controls now rely on `useUser`, handle sign-in/sign-up links with audience/scope params, and show profile chips when authenticated.
  - All FastAPI routes validate Auth0 access tokens (issuer/audience/JWKS) and derive the user ID from the token instead of headers.
- Phase 3 – Workspace Document Hub
  - Added an Azure Table–backed `DocumentIndexStore` and REST listing endpoint (`GET /jobs`) scoped to each Auth0 subject.
  - The workspace landing page now shows a document list (status, last update, artifact download) plus quick “Create new document” entry and detail selection.
  - Job creation/resume/status/timeline/artifact download flows include the access token automatically and refresh the document list when state changes.
- Phase 4 – Document Lifecycle Actions
  - Added per-document metadata (cycle counts, last error, artifact availability) stored via the Azure Table index.
  - Workspace list now supports search, status filters, manual refresh, and “Resume” actions for failed jobs with download buttons for final artifacts.
  - Job creation moved to `/newdocument`, keeping the workspace dedicated to portfolio tracking.

Diagrams
- The plan/review stages emit PlantUML blocks (`` ```plantuml `` or `@startuml…@enduml`).
- Diagram-prep extracts these blocks and the diagram-render worker calls your PlantUML server (`PLANTUML_SERVER_URL`), storing PNG/SVG files under `jobs/<job_id>/images/`.
- Diagram-prep publishes SKIPPED/QUEUED events; diagram-render emits START/DONE/FAILED updates (with counts) so the UI timeline reflects diagram progress.
- Finalize swaps the original blocks with image links so PDF/DOCX exports embed the rendered graphics.
- PlantUML sources are preformatted by an Azure OpenAI GPT-5 call (`DOCWRITER_PLANTUML_REFORMAT_MODEL`) to fix line breaks, indentation, and labels before rendering; ensure the configured model is available in your Azure OpenAI resource. Prep also strips stray Markdown fences and fails early if a diagram body is empty or contains Mermaid to avoid silent render errors.
- The writer agent references an internal PlantUML encyclopedia (class, sequence, component, mind map, ERD, etc.) to keep generated diagrams within supported syntax.
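The extract-then-swap flow described above can be illustrated with a short regex-based sketch. It is not the project's diagram-prep code: the function names and the sequential `diagram_<n>` file naming are assumptions, and bare `@startuml…@enduml` blocks outside a fence are not handled here.

```python
import re

FENCE = "`" * 3  # built this way so the example nests cleanly inside Markdown
_BLOCK = re.compile(rf"{FENCE}plantuml[ \t]*\n(.*?)\n{FENCE}", re.DOTALL)


def extract_diagrams(markdown):
    """Return the body of each fenced plantuml block, in document order."""
    return [match.group(1).strip() for match in _BLOCK.finditer(markdown)]


def replace_with_images(markdown, job_id, ext="png"):
    """Swap each fenced plantuml block for a Markdown image link."""
    counter = 0

    def _link(match):
        nonlocal counter
        counter += 1
        # Hypothetical naming: one numbered image per block under the job's images/ prefix.
        return f"![Diagram {counter}](jobs/{job_id}/images/diagram_{counter}.{ext})"

    return _BLOCK.sub(_link, markdown)
```

Numbering blocks in document order keeps the extracted sources and the rendered image links aligned without extra bookkeeping.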
Consistency Strategy
- Global memory: style guide, glossary, decisions, and facts shared across sections.
- Section writing constrained by the plan; reviewer iteratively flags contradictions.
- Chunked generation to support very long outputs without context loss.
- Dependency graph: sections generated in topological order; summaries of prerequisites guide dependents.
- Verifier: second pass checks final doc against dependency summaries to catch contradictions.
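Dependency-ordered writing can be sketched with the standard library's `graphlib`. The section ids and helper names below are made up for illustration, and the real planner's data model may differ.

```python
from graphlib import TopologicalSorter


def writing_order(dependencies):
    """Return section ids so that every prerequisite precedes its dependents.

    `dependencies` maps a section id to the set of section ids it depends on.
    graphlib raises CycleError if the plan contains a circular dependency.
    """
    return list(TopologicalSorter(dependencies).static_order())


def prerequisite_context(section_id, dependencies, summaries):
    """Collect the summaries of a section's prerequisites, in sorted id order."""
    prerequisites = sorted(dependencies.get(section_id, ()))
    return [summaries[dep] for dep in prerequisites if dep in summaries]


if __name__ == "__main__":
    plan = {
        "overview": set(),
        "architecture": {"overview"},
        "migration": {"architecture"},
        "summary": {"overview", "migration"},
    }
    print(writing_order(plan))
```

Feeding only prerequisite summaries to each dependent section, rather than full text, is what keeps prompts small while still giving the writer the facts it must not contradict.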
Status & Monitoring
- Topic: SERVICE_BUS_TOPIC_STATUS (default docwriter-status)
- Worker publishes a message for every stage transition (including START/QUEUED events). Messages include duration, token usage, model name, and artifact path.
- API exposes
/jobs/{job_id}/timelineand/jobs/artifacts?...so external clients can render dashboards without direct Service Bus or Blob access.
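Because each status message carries token usage, a dashboard can total review consumption across cycles from the timeline. The sketch below assumes each event is a dict with `stage`, `cycle`, and `tokens` keys; those field names are hypothetical, since the message schema is not specified here.

```python
REVIEW_STAGES = {"REVIEW", "VERIFY", "REWRITE"}


def review_token_total(events):
    """Sum token usage over every review-loop event, across all cycles."""
    return sum(
        int(event.get("tokens") or 0)
        for event in events
        if event.get("stage") in REVIEW_STAGES
    )


def tokens_by_cycle(events):
    """Group review-loop token usage by cycle number (missing cycle -> 0)."""
    totals = {}
    for event in events:
        if event.get("stage") in REVIEW_STAGES:
            cycle = event.get("cycle", 0)
            totals[cycle] = totals.get(cycle, 0) + int(event.get("tokens") or 0)
    return totals
```

Events without a token count are treated as zero so that QUEUED or START messages do not break the totals.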
Telemetry & Metrics
- Stage timings recorded and uploaded to Blob under jobs/<job_id>/metrics/...
- Optional OpenTelemetry export if OTEL_EXPORTER_OTLP_ENDPOINT is set.
Scalability
- Distributed only: enqueue jobs and run multiple workers per stage.
- Azure Service Bus decouples producers and workers for horizontal scale; Status topic enables dashboards.
Configuration
- Configuration is driven by environment variables (see Quick Start). No CLI commands remain; everything runs through FastAPI and worker jobs.
- Models: Planner/Reviewers default to `o3`, Writer defaults to `gpt-4.1`. Token usage is reported using real API metrics when available.
- Frontend UI lives under `ui/` (Next.js + Tailwind). Set `NEXT_PUBLIC_API_BASE_URL` to your API endpoint and run `npm run dev --prefix ui` for the glass-styled intake experience.
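Reading configuration only from environment variables could be done along the lines of this sketch. The `Settings` dataclass and `load_settings` are hypothetical, and the fallback values simply reuse the Quick Start examples; whether the library applies the same defaults is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    service_bus_namespace: str
    blob_container: str
    write_batch_size: int
    review_batch_size: int
    review_max_prompt_tokens: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    if not env.get("SERVICE_BUS_NAMESPACE"):
        raise RuntimeError("SERVICE_BUS_NAMESPACE must be set")
    return Settings(
        service_bus_namespace=env["SERVICE_BUS_NAMESPACE"],
        # Fallbacks below copy the Quick Start example values (assumed defaults).
        blob_container=env.get("AZURE_BLOB_CONTAINER", "docwriter"),
        write_batch_size=int(env.get("DOCWRITER_WRITE_BATCH_SIZE", "5")),
        review_batch_size=int(env.get("DOCWRITER_REVIEW_BATCH_SIZE", "3")),
        review_max_prompt_tokens=int(env.get("DOCWRITER_REVIEW_MAX_PROMPT_TOKENS", "12000")),
    )
```

Accepting the mapping as a parameter lets tests pass a plain dict instead of mutating the process environment.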
Testing
- Run: pytest -q
- Tests use a FakeLLM for deterministic results.
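The idea behind a fake model client is shown in the sketch below. It is not the repository's `FakeLLM`: the `complete` method, the scripted-reply constructor, and the `summarize` function under test are all hypothetical and exist only to show how deterministic replies make assertions stable.

```python
class FakeLLM:
    """Stand-in for a model client: returns scripted replies and records prompts."""

    def __init__(self, replies, default="OK"):
        self._replies = list(replies)
        self._default = default
        self.prompts = []

    def complete(self, prompt):
        self.prompts.append(prompt)
        # Serve scripted replies in order, then fall back to the default.
        return self._replies.pop(0) if self._replies else self._default


def summarize(llm, section_text):
    """Toy function under test: asks the model for a one-line summary."""
    return llm.complete(f"Summarize in one line:\n{section_text}").strip()


def test_summarize_is_deterministic():
    llm = FakeLLM(["  A short summary.  "])
    assert summarize(llm, "body") == "A short summary."
    assert llm.prompts == ["Summarize in one line:\nbody"]
```

Recording prompts lets a test check what was sent to the model as well as what came back.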
Notes
- This repo scaffolds the full flow. Real LLM calls require valid API keys and Azure Service Bus + Blob access.
Use these steps to reconstruct the environment in a fresh session:
1. **Container Images**
- Build or rely on the GitHub Actions workflow (`docker-build.yml`). Images push to `aidocwriteracr.azurecr.io` with tags `:latest` and `:v<git describe>`.
- Local build example from repo root:

      docker build --platform linux/amd64 -t aidocwriteracr.azurecr.io/docwriter-worker-job:v1 -f Dockerfile.worker .
      docker push aidocwriteracr.azurecr.io/docwriter-worker-job:v1
      # repeat for API/UI/plantuml images
2. **Environment Variables (all runtimes)**

       export OPENAI_API_KEY=...
       export OPENAI_BASE_URL=...
       export OPENAI_API_VERSION=...
       export SERVICE_BUS_NAMESPACE=<service-bus-namespace-name>
       export SERVICE_BUS_QUEUE_PLAN_INTAKE=docwriter-plan-intake
       export SERVICE_BUS_QUEUE_INTAKE_RESUME=docwriter-intake-resume
       export SERVICE_BUS_QUEUE_PLAN=docwriter-plan
       export SERVICE_BUS_QUEUE_WRITE=docwriter-write
       export SERVICE_BUS_QUEUE_REVIEW=docwriter-review  # legacy alias for general review
       export SERVICE_BUS_QUEUE_REVIEW_GENERAL=docwriter-review
       export SERVICE_BUS_QUEUE_REVIEW_STYLE=docwriter-review-style
       export SERVICE_BUS_QUEUE_REVIEW_COHESION=docwriter-review-cohesion
       export SERVICE_BUS_QUEUE_REVIEW_SUMMARY=docwriter-review-summary
       export SERVICE_BUS_QUEUE_VERIFY=docwriter-verify
       export SERVICE_BUS_QUEUE_REWRITE=docwriter-rewrite
       export SERVICE_BUS_QUEUE_DIAGRAM_PREP=docwriter-diagram-prep
       export SERVICE_BUS_QUEUE_DIAGRAM_RENDER=docwriter-diagram-render
       export SERVICE_BUS_QUEUE_FINALIZE_READY=docwriter-finalize-ready
       export SERVICE_BUS_TOPIC_STATUS=docwriter-status
       export SERVICE_BUS_STATUS_SUBSCRIPTION=console
       export AZURE_STORAGE_CONNECTION_STRING=...
       export AZURE_BLOB_CONTAINER=docwriter
       export PLANTUML_SERVER_URL=https://plantuml.example.com
       export APPINSIGHTS_INSTRUMENTATION_KEY=...
       export NEXT_PUBLIC_API_BASE_URL=https://
3. **Terraform deployment**
- Secrets: `spn-client-id`, `spn-client-secret`, `spn-tenant-id`, `subscription-id`, `openai_base_url`, `openai_api_version`, `openai_api_key_secret`.
- Modules provision RG, ACR, Service Bus, Storage, App Insights, Container Apps.
- Container Apps module consumes API/UI images and one worker-job image plus per-job KEDA metadata (`worker_jobs`), `api_env`, `worker_env`, and optional `api_secrets` describing Key Vault secret IDs. Managed identity is granted `Key Vault Secrets User` at vault scope plus Service Bus sender/receiver roles.
- Run:
```bash
terraform -chdir=infra/terraform init
terraform -chdir=infra/terraform apply \
-var "spn-client-id=..." \
-var "spn-client-secret=..." \
-var "spn-tenant-id=..." \
-var "subscription-id=..." \
-var "openai_base_url=..." \
-var "openai_api_version=..." \
-var "openai_api_key_secret=..."
```
4. **Azure Application Insights**
- `APPINSIGHTS_INSTRUMENTATION_KEY` enables automatic event and exception tracking (`job_enqueued`, `job_status`, stage start/completion, and any errors via `track_exception`).
- Review telemetry for lost messages or failures.
5. **Queue-driven Pipeline**
- Messages enqueue with blob-backed draft paths; all workers read/write documents in Blob Storage.
- Intake worker persists questions, context, and sample answers in `jobs/<id>/intake/...`. Resume pulls context blob to rebuild payload.
- Writers, reviewers, verifier, and rewriters operate entirely from blob payloads.
6. **REST + UI**
- API endpoints: `/jobs`, `/jobs/{id}/resume`, `/jobs/{id}/status`, `/intake/questions`, `/healthz`.
- Next.js UI under `ui/` uses Tailwind glass theme; set `NEXT_PUBLIC_API_BASE_URL` before `npm run dev` or in deployment.
- Intake form pre-fills sample answers returned from the API.
7. **CLI**
- `docwriter generate` / `docwriter resume` share the same queue helpers and require the env vars above.
8. **Logging & Monitoring**
- Workers respect `LOG_DIR`/`DOCWRITER_LOG_LEVEL` for console/file logging.
- Application Insights now records all stage transitions and exceptions.
9. **Local sample script**
- `scripts/run_sample_service_bus.sh` validates required env vars, enqueues a job, uploads sample answers, and signals resume.