Kale Deploy is a GitHub-first publishing system from the CUNY AI Lab. It lets a user work with an AI agent, connect a GitHub repository, and publish a website or Worker-based web app to a host-based project URL like `https://<project-name>.cuny.qzz.io`.
- Public front door: https://cuny.qzz.io/kale
- Runtime manifest: https://runtime.cuny.qzz.io/.well-known/kale-runtime.json
- Project URL pattern: `https://<project-name>.cuny.qzz.io`
The lower-level Cloudflare Workers still exist behind that front door:
- deploy-service: https://cail-deploy-service.ailab-452.workers.dev
- internal gateway: `https://cail-gateway.ailab-452.workers.dev/<project-name>`
- `packages/deploy-service`: control-plane Worker for onboarding, GitHub App setup, MCP/OAuth, project registration, validation, and deployment state.
- `packages/gateway-worker`: internal gateway Worker that dispatches requests to deployed project Workers.
- `packages/project-host-proxy`: wildcard host proxy on `cuny.qzz.io` that routes public traffic to the gateway and exposes the public runtime manifest.
- `packages/build-runner`: managed Node-based runner that pulls build jobs from Cloudflare Queues, builds Worker artifacts, and reports back to the deploy service.
- `packages/build-contract`: shared types used by the deploy service and the runner.
- `platform`: assistant-neutral runtime contract in Markdown and JSON.
- `plugins/kale-deploy`: shared Codex and Claude Code plugin bundle plus the `kale-init` assistant workflow.
- `gemini-extension.json` plus `skills/kale-deploy`: Gemini extension install surface for the same Kale guidance.
At a high level:
- A user connects a repository through the shared GitHub App.
- A push to the default branch reaches the deploy-service webhook.
- The deploy service creates a GitHub check, stores state in D1, and enqueues a build job.
- The build runner checks out the repository, builds the deployable bundle, and calls back with artifact metadata plus runtime evidence.
- The deploy service either promotes a certified static build onto the shared-static lane or uploads a dedicated Worker to Workers for Platforms, updates deployment state, and completes the GitHub check.
- Public traffic reaches `https://<project-name>.cuny.qzz.io` through the host proxy and gateway.
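The routing step above can be sketched as a small host-to-project mapping. This is an illustrative sketch, not the actual project-host-proxy code: the suffix constant comes from the URL pattern above, and the single-label rule (one subdomain label per project name) is an assumption.

```typescript
// Sketch: map a public hostname like "my-site.cuny.qzz.io" to its
// project name, as the host proxy must do before dispatching to the
// gateway. Not the real implementation.
const PROJECT_HOST_SUFFIX = ".cuny.qzz.io"; // assumed from the URL pattern above

function projectNameFromHost(host: string): string | null {
  if (!host.endsWith(PROJECT_HOST_SUFFIX)) return null;
  const name = host.slice(0, -PROJECT_HOST_SUFFIX.length);
  // Reject the apex and nested subdomains: exactly one label is a project name.
  if (name.length === 0 || name.includes(".")) return null;
  return name;
}
```

Given a name, the gateway request would then target `https://cail-gateway.ailab-452.workers.dev/<project-name>`.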
```
.
├── docs/
├── packages/
│   ├── build-contract/
│   ├── build-runner/
│   ├── deploy-service/
│   ├── gateway-worker/
│   └── project-host-proxy/
├── platform/
└── plugins/
    └── kale-deploy/
```
Install dependencies from the repo root:

```shell
npm install
```

Run the workspace checks:

```shell
npm run check
```

Run the deploy-service tests:

```shell
npm run test --workspace @cuny-ai-lab/deploy-service
```

Apply the deploy-service D1 migrations before deploying the Workers:

```shell
npx wrangler d1 migrations apply cail-control-plane --config packages/deploy-service/wrangler.jsonc
```

Deploy the core Workers:

```shell
npx wrangler deploy --config packages/deploy-service/wrangler.jsonc
npx wrangler deploy --config packages/gateway-worker/wrangler.jsonc
npx wrangler deploy --config packages/project-host-proxy/wrangler.jsonc
```

Set the shared preview secret on both Workers before expecting automatic `shared_static` cutover:

```shell
npx wrangler secret put GATEWAY_PREVIEW_TOKEN --config packages/deploy-service/wrangler.jsonc
npx wrangler secret put GATEWAY_PREVIEW_TOKEN --config packages/gateway-worker/wrangler.jsonc
```

The build runner is separate from the Workers deployment. If you change `packages/build-runner` or `packages/build-contract`, rebuild and restart the runner too so production classification matches the deployed Workers. See the runner hosting guide below before changing how it is hosted.
There is also a real lifecycle smoke runner for a dedicated smoke repository:

```shell
npm run ops:lifecycle-smoke
```

It uses the public Kale MCP tool surface plus real Git pushes to exercise:

- an initial live dedicated Worker deploy
- a same-lane dedicated Worker redeploy
- a redeploy that promotes the same repo onto `shared_static`
- a same-lane `shared_static` redeploy
- a redeploy that demotes the repo back onto `dedicated_worker`
- a real `delete_project` plus public unpublish check
- a fresh re-register plus restore deploy
Required environment variables for the smoke runner:
- `KALE_SMOKE_REPOSITORY`
- `KALE_SMOKE_GITHUB_TOKEN`
- `KALE_SMOKE_KALE_TOKEN`
The GitHub token must be able to push to the dedicated smoke repo. The Kale token is used as a bearer token on `/mcp`; in practice it should be a fresh `kale_pat_*` token from https://cuny.qzz.io/kale/connect, issued to a user who has already linked GitHub admin access for that smoke repo, because the delete step uses the same project-admin check as the live product.
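A minimal pre-flight check for those three variables might look like the following; this wrapper is illustrative only and is not part of the repo.

```typescript
// Illustrative pre-flight check: confirm the smoke runner's required
// environment variables (names from the list above) are present and
// non-blank before invoking `npm run ops:lifecycle-smoke`.
const REQUIRED_SMOKE_VARS = [
  "KALE_SMOKE_REPOSITORY",
  "KALE_SMOKE_GITHUB_TOKEN",
  "KALE_SMOKE_KALE_TOKEN",
];

function missingSmokeVars(env: Record<string, string | undefined>): string[] {
  // A variable set to whitespace counts as missing.
  return REQUIRED_SMOKE_VARS.filter((name) => !env[name]?.trim());
}
```

Calling `missingSmokeVars(process.env)` before the smoke run and aborting on a non-empty result gives a clearer failure than letting the runner discover the gap mid-scenario.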
Kale Deploy exposes:
- a remote MCP server at `POST /mcp`
- MCP OAuth metadata under `/.well-known/oauth-*`
- a machine-readable runtime manifest at https://runtime.cuny.qzz.io/.well-known/kale-runtime.json
- structured JSON endpoints for repo registration, validation, and status polling
The MCP surface is a thin adapter over the deploy-service control plane, not a separate backend.
Current harness note:
- The public setup page now shows short per-agent install instructions for the `kale-deploy` add-on, then a plain-language build prompt.
- Codex normally uses the Kale add-on install path; direct `codex mcp add` plus `codex mcp login` is the manual fallback.
- Claude Code should use `mcp-remote` only to complete OAuth, then finalize to one direct HTTP `kale` server whose `headersHelper` reads the latest valid Kale OAuth token through the installed `kale-claude-connect.mjs` helper; if a refresh token is cached, the helper refreshes automatically, and `/connect` stays the last resort.
- Claude has an explicit user-scope plugin update command: `claude plugins update kale-deploy@cuny-ai-lab -s user`.
- Gemini CLI supports extension updates directly, and the preferred public install path now uses `--auto-update`.
- The runtime manifest is the source of truth for harness-specific install and update policy if the local plugin copy drifts.
- Agents should read `dynamic_skill_policy`, `client_update_policy`, and `agent_harnesses` from `get_runtime_manifest`, then use `dynamicSkillPolicy`, `clientUpdatePolicy`, `harnesses`, `nextAction`, and `summary` from `test_connection` as the live source of truth.
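As a sketch, the agent-side read of those manifest fields could look like this; the field names come from the bullets above, while the surrounding manifest shape is assumed rather than specified here.

```typescript
// Hedged sketch: given a parsed get_runtime_manifest result, pull out
// the policy fields agents are expected to honor. The full manifest
// shape is an assumption; only these field names appear in the text.
interface AgentPolicyView {
  dynamic_skill_policy?: unknown;
  client_update_policy?: unknown;
  agent_harnesses?: unknown;
}

function readAgentPolicies(manifest: Record<string, unknown>): AgentPolicyView {
  return {
    dynamic_skill_policy: manifest["dynamic_skill_policy"],
    client_update_policy: manifest["client_update_policy"],
    agent_harnesses: manifest["agent_harnesses"],
  };
}
```

A harness would fetch the manifest once at startup, apply these policies, then prefer the camelCase equivalents returned live by `test_connection`.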
Implemented now:
- GitHub App manifest flow and webhook handling
- Cloudflare Access-backed OAuth for MCP clients
- structured repo registration, validation, and deployment status APIs
- Workers for Platforms project deployment
- host-based project URLs on `*.cuny.qzz.io`
- Codex and Claude plugin packaging
- an AWS-hosted build runner with a stable Elastic IP and SSM-ready instance profile
- a scheduled GitHub Actions healthcheck for the public stack and AWS runner
- a scheduled GitHub Actions lifecycle smoke workflow for real dedicated/shared-static/deletion/redeploy scenario checks on a dedicated smoke repo
- no daily validate/deploy cap by default, but support for per-repository override caps
- repo-scoped `DB`, `FILES`, and `CACHE` provisioning
- approval-only policy for advanced bindings such as AI, Vectorize, and Rooms
- explicit `static` and `worker` starter shapes for new projects
- conservative automatic `shared_static` cutover for certified pure static builds
- internal groundwork for future custom-domain routing
Still not finished:
- fully polished onboarding with no human GitHub approval step
- full external custom-domain support
- a richer admin surface for per-project caps
- institution-wide support polish beyond the current pilot shape
- Student quickstart: docs/quickstart-students.md
- Claude quickstart: docs/claude-quickstart.md
- Codex quickstart: docs/codex-quickstart.md
- Support matrix: docs/support-matrix.md
- Troubleshooting: docs/troubleshooting.md
- Operator runbook: docs/runbook.md
- Pilot policy: docs/pilot-policy.md
- Pilot checklist: docs/pilot-checklist.md
- Agent API: docs/agent-api.md
- Cloudflare Access auth: docs/cloudflare-access-auth.md
- GitHub App setup: docs/github-app-setup.md
- Friendly URL rollout: docs/friendly-url-rollout.md
- Future custom domains plan: docs/custom-domains-plan.md
- Build runner hosting: docs/build-runner-hosting.md
- Build runner contract: docs/build-runner-contract.md
Deployment metadata stays in D1. Artifact retention is intentionally bounded:
- keep the latest two successful deployment bundles per project in R2 for rollback
- keep failed deployment manifests briefly for debugging
- prune older R2 artifacts while preserving deployment metadata in D1
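The retention rule above can be sketched as a selection function over deployment records; the record shape below is assumed for illustration and is not the actual D1 schema.

```typescript
// Sketch of the retention policy: keep the newest `keep` successful
// bundles per project, mark the rest as prunable from R2. Metadata
// itself stays in D1 regardless of pruning.
interface ArtifactRecord {
  project: string;
  key: string;       // R2 object key (assumed field)
  createdAt: number; // epoch ms (assumed field)
  success: boolean;
}

function prunableKeys(records: ArtifactRecord[], keep = 2): string[] {
  const byProject = new Map<string, ArtifactRecord[]>();
  for (const r of records) {
    if (!r.success) continue; // failed manifests follow a separate, shorter retention
    const list = byProject.get(r.project) ?? [];
    list.push(r);
    byProject.set(r.project, list);
  }
  const prunable: string[] = [];
  for (const list of byProject.values()) {
    list.sort((a, b) => b.createdAt - a.createdAt); // newest first
    prunable.push(...list.slice(keep).map((r) => r.key));
  }
  return prunable;
}
```

Keeping two successful bundles means the latest deploy plus one rollback target survive every prune pass.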