Merged
8 changes: 2 additions & 6 deletions README.md
@@ -3,12 +3,8 @@
</p>

<p align="center">
<a href="https://github.com/danyellambert/ai-workbench-local/actions/workflows/product-ci.yml">
<img alt="Product CI" src="https://github.com/danyellambert/ai-workbench-local/actions/workflows/product-ci.yml/badge.svg?branch=main" />
</a>
<a href="https://github.com/danyellambert/ai-workbench-local/actions/workflows/deploy-aws.yml">
<img alt="AWS CD" src="https://github.com/danyellambert/ai-workbench-local/actions/workflows/deploy-aws.yml/badge.svg?branch=main" />
</a>
<a href="https://github.com/danyellambert/ai-workbench-local/actions/workflows/product-ci.yml"><img alt="Product CI" src="https://github.com/danyellambert/ai-workbench-local/actions/workflows/product-ci.yml/badge.svg?branch=main" /></a>
<a href="https://github.com/danyellambert/ai-workbench-local/actions/workflows/deploy-aws.yml"><img alt="AWS CD" src="https://github.com/danyellambert/ai-workbench-local/actions/workflows/deploy-aws.yml/badge.svg?branch=main" /></a>
</p>

<p align="center">
123 changes: 122 additions & 1 deletion ROADMAP.md
@@ -85,6 +85,8 @@ Primary documents:
- `docs/product/overview.md`
- `docs/architecture/current-product-surface.md`
- `docs/deployment/MULTI_ENVIRONMENT_CONTRACT.md`
- `docs/deployment/aws-ssm-code-only-deploy.md`
- `docs/operations/ci-cd-and-release-controls.md`
- `scripts/README.md`
- `tests/README.md`

@@ -126,6 +128,16 @@ scripts/readiness_multi_environment_contract_check.sh
| 12 | Completed foundation | Production-readiness baseline, Golden Surface, provider strategy, frontend parity, and Docker real-backend discipline |
| 13 | Completed / legacy-deferred | Oracle-like Docker, public/admin overlays, hardening, sidecars, and later AWS convergence |
| 14 | In progress | Final validation, repository presentation, release polish, screenshots, and public handoff |
| 15 | Completed | Public product identity, branding, README narrative, and repository presentation |
| 16 | Completed | Public/admin runtime policy, session overlays, quotas, and demo safety |
| 17 | Completed | Cloud deployment hardening and domain-ready AWS operation |
| 18 | Completed | Product workflow reliability and long-running execution behavior |
| 19 | Completed | Delivery layer and external integration productization |
| 20 | Completed | AI Lab, provider controls, evals, and runtime observability |
| 21 | Completed | Publication readiness and current product gate |
| 22 | Completed | Public setup package and repository hygiene |
| 23 | Completed | GitHub Actions Product CI and AWS CD |
| 24 | Completed | SSM-based AWS deployment and low-disk rebuild hardening |

There is no intended gap between 10.5 and 13. After the original Phase 11
publication plan, the project entered a production-readiness runbook that used
@@ -1265,6 +1277,104 @@ Primary references:
- `.github/workflows/product-ci.yml`;
- `scripts/run_current_test_gate.sh`.

### Phase 22 - Public Setup Package and Repository Hygiene

Purpose:

- make the repository runnable from a clean public clone without relying on private local assumptions;
- keep private credentials, runtime payloads, scratch files, and generated material out of Git;
- align the setup path, documentation indexes, assets, and workflow names with the current product contract.

Completed work:

- expanded the README quickstart and linked it to a deeper full local product setup guide;
- documented the full local path for Docker, admin mode, Nextcloud/WebDAV, provider credentials, deck generation, Trello, and Notion;
- sanitized Docker and AWS example env files so public configuration examples describe required values without exposing private metadata;
- reorganized `.gitignore` around current product boundaries, private env files, local scratch material, generated frontend output, and ignored design drafts;
- added canonical public visual assets for the current product presentation path;
- renamed evaluation workflows away from phase-specific naming so they read as maintained engineering workflows instead of historical experiments.

Current status:

- the public setup path is documented as a product package rather than a collection of scripts;
- real secrets, local runtime state, and private baseline archives remain outside source control;
- the active setup story points to Docker, Product API, frontend, sidecars, and the current documentation map.

Primary references:

- `README.md`;
- `docs/deployment/FULL_LOCAL_PRODUCT_SETUP.md`;
- `.env.docker.example`;
- `.env.aws.example`;
- `.gitignore`;
- `.github/workflows/evals.yml`;
- `.github/workflows/evals-live.yml`;
- `docs/assets/product/hero-decision-loop.webp`.

### Phase 23 - GitHub Actions Product CI and AWS CD

Purpose:

- make the current product gate visible, repeatable, and tied to the repository's main branch;
- separate active CI/CD controls from legacy exploratory test material;
- ensure AWS deployment runs only after the current product contract has passed validation.

Completed work:

- added Product CI for current Product API Python entrypoints, frontend dependency install, frontend tests, frontend build, shell syntax validation, and compose/readiness contract checks;
- kept the current Python test gate as a focused validation job using root `requirements.txt`;
- added AWS CD as a protected deployment workflow with manual `dry-run` / `execute` modes;
- connected AWS CD to successful Product CI runs on `main`;
- configured GitHub OIDC for AWS credentials and kept instance identifiers, SSH material, and deployment targets in GitHub secrets or environment variables;
- validated deploy scripts inside Actions before any remote deployment step runs.

Current status:

- the repository has an auditable CI/CD control plane for product validation and AWS deployment;
- Product CI and AWS CD are documented as operational gates, not as separate demo scripts;
- manual deploy runs can be inspected before live execution through the `dry-run` path.

Primary references:

- `.github/workflows/product-ci.yml`;
- `.github/workflows/deploy-aws.yml`;
- `scripts/run_current_test_gate.sh`;
- `scripts/readiness_multi_environment_contract_check.sh`;
- `docs/operations/ci-cd-and-release-controls.md`.

### Phase 24 - SSM-Based AWS Deployment and Low-Disk Rebuild Hardening

Purpose:

- support code-only AWS rebuilds on the existing EC2 host without opening a broad public SSH path;
- keep deployments reproducible when the host has constrained disk space;
- preserve runtime state, secrets, and Docker volumes while replacing application code and container images.

Completed work:

- added a GitHub Actions AWS deployment path that reaches EC2 through an AWS SSM port-forwarding tunnel;
- introduced SSM deploy scripts for code-only deployment, remote bundle creation, persistent-path guardrails, and dry-run verification;
- made the remote deploy path fetch the exact Git commit, build a clean deployment bundle, and validate that no runtime data or secrets are included;
- hardened AWS rebuilds for low-disk hosts by pruning unused Docker data before the build and relaxing dry-run disk preflight into an advisory warning;
- fixed Ollama readiness handling so repeated deploys do not collide on a stale temporary readiness file;
- kept smoke/readiness checks around the same five-service AWS compose contract.

Current status:

- AWS CD can deploy the current code through SSM and rebuild the product stack with low-disk safeguards;
- code-only deployment updates app files and images while preserving mounted runtime state, private secrets, baselines, and Docker volumes;
- the deployment remains traceable to a Git commit and a deployment-bundle report.

Primary references:

- `.github/workflows/deploy-aws.yml`;
- `scripts/deploy_aws_code_only.sh`;
- `scripts/deploy_aws_code_only_ssm.sh`;
- `scripts/deploy_aws_code_only_ssm_remote.sh`;
- `scripts/deploy_aws.sh`;
- `scripts/smoke_aws.sh`;
- `docs/deployment/aws-ssm-code-only-deploy.md`.

---

## 5. Current Integrations and Their Status
@@ -1342,6 +1452,10 @@ scripts/run_current_test_gate.sh
scripts/readiness_multi_environment_contract_check.sh
```

GitHub Actions now runs this contract through Product CI, and AWS CD is connected
to successful Product CI runs on `main` with a manual `dry-run` / `execute`
control for deployment.

Important interpretation:

- the full historical Python discovery suite is not the canonical validation gate;
@@ -1365,6 +1479,8 @@ Additional focused checks exist for:
- EvidenceOps UI cache;
- Trello public visibility;
- AWS env contract;
- GitHub Actions Product CI;
- AWS CD through SSM-based code-only deployment.

---

@@ -1388,6 +1504,8 @@ The repository demonstrates:
- overlay-based demo safety;
- deck/artifact generation through a dedicated rendering sidecar;
- Docker and AWS deployment discipline;
- GitHub Actions CI/CD discipline;
- SSM-based AWS deployment hardening;
- repository cleanup and operational documentation.

---
@@ -1450,7 +1568,10 @@ AI Decision Studio evolved through a coherent engineering arc:
13. deck/artifact generation;
14. production-readiness baseline and public/admin overlay policy;
15. local Docker and AWS deployment;
16. repository organization for long-term maintainability.
16. repository organization for long-term maintainability;
17. public setup packaging and repository hygiene;
18. GitHub Actions Product CI and AWS CD;
19. SSM-based AWS deployment and low-disk rebuild hardening.

The next milestone is final publication polish: screenshots, architecture diagram,
demo narrative, release tag, and one final preindexed Nextcloud fast-import proof.
2 changes: 2 additions & 0 deletions docs/README.md
@@ -12,9 +12,11 @@ Start here if you are new to the repository:
- architecture/data-payload.md — the versioned Docker/AWS data payload and the four mounted roots.
- deployment/local-docker-compose.md — how to run the current product locally with Docker Compose.
- deployment/aws-deploy.md — how the AWS deployment is structured.
- deployment/aws-ssm-code-only-deploy.md — how GitHub Actions deploys AWS code-only releases through SSM.
- deployment/deployment-evolution.md — how local Docker, AWS, Caddy, restore, and credentials converged.
- deployment/python-dependencies.md — current single-file Python dependency contract.
- operations/backup-and-restore.md — operational backup and restore notes for local/AWS data roots.
- operations/ci-cd-and-release-controls.md — Product CI, AWS CD, release controls, secrets boundaries, and eval workflow naming.
- operations/engineering-controls.md — public/admin boundaries, quotas, readiness gates, credentials, and observability.
- guides/ — task-oriented supporting workflow guides.
- reference/ — reference material for benchmarks and evidence/CV workflows.
4 changes: 4 additions & 0 deletions docs/architecture/evals/ai-engineering-trajectory.md
@@ -59,9 +59,13 @@ Important threads:
- model comparison and provider selection;
- eval typing and live verdict labels;
- current benchmark/eval UI surfaces;
- public workflow naming for maintained eval checks through `evals.yml` and
`evals-live.yml`, replacing phase-numbered workflow names.

Primary references:

- `.github/workflows/evals.yml`
- `.github/workflows/evals-live.yml`
- `docs/architecture/evals/benchmark-execution.md`
- `docs/architecture/evals/decision-gate.md`
- `docs/architecture/evals/closure.md`
2 changes: 2 additions & 0 deletions docs/deployment/README.md
@@ -7,8 +7,10 @@ Deployment runbooks and environment contracts for local Docker and AWS. Oracle-l
- [AI_LAB_GOLDEN_STATE_RESTORE.md](AI_LAB_GOLDEN_STATE_RESTORE.md) — AI Lab Golden State Restore
- [aws-cost-audit.md](aws-cost-audit.md) — AWS Cost and Resource Audit
- [aws-deploy.md](aws-deploy.md) — AWS deploy
- [aws-ssm-code-only-deploy.md](aws-ssm-code-only-deploy.md) — AWS SSM code-only deploy
- [AWS_FRESH_EC2_BOOTSTRAP.md](AWS_FRESH_EC2_BOOTSTRAP.md) — Axiovance — AWS fresh EC2 bootstrap runbook
- [deployment-evolution.md](deployment-evolution.md) — Deployment evolution from local Docker to AWS, Caddy, restore, and credential contracts
- [FULL_LOCAL_PRODUCT_SETUP.md](FULL_LOCAL_PRODUCT_SETUP.md) — Full local product setup
- [local-docker-compose.md](local-docker-compose.md) — Local Docker Compose
- [LOCAL_FULL_APP_DEV.md](LOCAL_FULL_APP_DEV.md) — Local full app development
- [MULTI_ENVIRONMENT_CONTRACT.md](MULTI_ENVIRONMENT_CONTRACT.md) — Axiovance — Multi-environment contract
170 changes: 170 additions & 0 deletions docs/deployment/aws-ssm-code-only-deploy.md
@@ -0,0 +1,170 @@
# AWS SSM Code-Only Deploy

This document describes the current AWS deployment path for updating the product
on an existing EC2 host through GitHub Actions, AWS SSM, and a code-only rebuild.

## Purpose

The SSM code-only path exists to deploy the current product without treating the
EC2 host as disposable infrastructure and without committing runtime state.

It is designed to:

- deploy the exact Git commit selected by GitHub Actions;
- keep mounted runtime, artifacts, users, private baselines, secrets, and Docker
volumes in place;
- replace application code and rebuild images safely;
- support `dry-run` checks before live execution;
- reduce reliance on public SSH access by using SSM port forwarding;
- tolerate constrained disk space during rebuilds.

## High-Level Flow

1. Product CI validates the current product contract.
2. AWS CD checks out the deploy commit.
3. AWS CD assumes the AWS deployment role through GitHub OIDC.
4. AWS CD validates the deploy scripts with `bash -n`.
5. AWS CD opens an AWS SSM port-forwarding tunnel to the target EC2 host's SSH
port.
6. `scripts/deploy_aws_code_only.sh` connects through the local SSM tunnel.
7. The remote deploy path fetches the same Git commit from GitHub.
8. A deployment bundle is built from that source checkout.
9. Bundle checks confirm that secrets, real env files, runtime data, backups,
and heavy generated payloads are excluded.
10. In `dry-run`, the process stops after validating the host and bundle.
11. In `execute`, the app is replaced, images are rebuilt, and the stack is
restarted while persistent runtime paths remain mounted.
12. Smoke/readiness checks validate the live product path.
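
Step 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's actual code: the helper names, the local port `2222`, and the instance ID are hypothetical. `aws ssm start-session` with the `AWS-StartPortForwardingSession` document is the standard AWS CLI mechanism for port forwarding, and it needs the Session Manager plugin on the runner.

```bash
# Sketch only: helper names and default ports are assumptions, not taken from the repository.

# Print the JSON parameters for the AWS-StartPortForwardingSession document.
ssm_tunnel_params() (
  remote_port="$1"
  local_port="$2"
  printf '{"portNumber":["%s"],"localPortNumber":["%s"]}' "$remote_port" "$local_port"
)

# Open the tunnel in the background (requires AWS credentials and the Session Manager plugin).
open_ssm_tunnel() {
  instance_id="$1"
  aws ssm start-session \
    --target "$instance_id" \
    --document-name AWS-StartPortForwardingSession \
    --parameters "$(ssm_tunnel_params 22 "${2:-2222}")" &
}

# Example (not run here): open_ssm_tunnel "i-0123456789abcdef0" 2222
# SSH would then target the local end of the tunnel, for example 127.0.0.1 on port 2222.
```

Keeping the parameter construction in its own function makes the port mapping easy to inspect in logs before a session is opened.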

## Scripts

### `.github/workflows/deploy-aws.yml`

The GitHub Actions workflow owns:

- manual `dry-run` / `execute` dispatch;
- post-Product-CI deployment on `main`;
- AWS OIDC credential setup;
- protected `aws-production` environment usage;
- SSM tunnel creation;
- temporary SSH key setup inside the runner;
- deploy-script syntax validation.

### `scripts/deploy_aws_code_only.sh`

This is the existing code-only SSH deploy entrypoint. In the SSM workflow it is
pointed at the local tunnel host alias instead of a public EC2 address.

It is responsible for:

- selecting `--dry-run` or `--execute`;
- building or transferring the deploy bundle to the path the AWS host expects;
- invoking the remote update flow;
- preserving runtime data and private env files.
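
The mode selection can be pictured with a small sketch. `parse_deploy_mode` is a hypothetical helper, not the script's real parser, and requiring an explicit mode (rather than defaulting to one) is an assumption made here for safety.

```bash
# Sketch only: parse_deploy_mode is a hypothetical helper, not the script's real parser.
# Prints "dry-run" or "execute"; refuses conflicting or missing mode flags.
parse_deploy_mode() (
  mode=""
  for arg in "$@"; do
    case "$arg" in
      --dry-run) new="dry-run" ;;
      --execute) new="execute" ;;
      *) continue ;;   # other arguments are ignored by this sketch
    esac
    if [ -n "$mode" ] && [ "$mode" != "$new" ]; then
      echo "error: --dry-run and --execute are mutually exclusive" >&2
      exit 2
    fi
    mode="$new"
  done
  if [ -z "$mode" ]; then
    echo "error: pass --dry-run or --execute" >&2
    exit 2
  fi
  printf '%s\n' "$mode"
)
```

A caller could write `mode="$(parse_deploy_mode "$@")" || exit $?` so that an ambiguous invocation never reaches the remote host.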

### `scripts/deploy_aws_code_only_ssm.sh`

This is the SSM command driver for direct SSM execution. It sends a remote shell
command to the EC2 instance through AWS Systems Manager and points that command
at `scripts/deploy_aws_code_only_ssm_remote.sh` for the selected Git commit.
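
A minimal sketch of how such a driver might assemble its remote command, assuming a `--commit` flag and mode flags on the remote script (both hypothetical; the real script's interface is not shown in this document). The commented `aws ssm send-command` call uses the standard `AWS-RunShellScript` document.

```bash
# Sketch only: function names and the --commit flag are assumptions.

# Accept only a full 40-character hex commit SHA so the deploy target is unambiguous.
is_full_sha() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# Build the command line that would run on the instance through SSM.
remote_deploy_command() (
  sha="$1"
  mode="$2"
  is_full_sha "$sha" || { echo "error: not a full commit SHA: $sha" >&2; exit 2; }
  printf 'bash scripts/deploy_aws_code_only_ssm_remote.sh --commit %s --%s' "$sha" "$mode"
)

# Example (not run here; INSTANCE_ID and GIT_SHA are placeholders):
#   cmd="$(remote_deploy_command "$GIT_SHA" dry-run)"
#   aws ssm send-command \
#     --instance-ids "$INSTANCE_ID" \
#     --document-name AWS-RunShellScript \
#     --parameters "commands=[\"$cmd\"]"
```

Rejecting abbreviated SHAs and branch names before the command is sent keeps the deploy traceable to one exact commit.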

### `scripts/deploy_aws_code_only_ssm_remote.sh`

This remote script performs the host-side code-only deploy work:

- checks persistent paths and permissions;
- validates Docker and Docker Compose availability;
- checks expected Docker volumes;
- fetches the selected Git commit into temporary staging;
- builds the deployment bundle from release source;
- verifies the bundle report;
- supports `dry-run` without changing live app files;
- extracts and applies the new app in `execute` mode;
- keeps private data roots, secret roots, and Docker volumes intact.
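
The persistent-path check could look something like the sketch below. The helper name is hypothetical; the four subdirectories mirror the data roots listed under the persistent state boundary in this document, and the real script may check more than this.

```bash
# Sketch only: the helper name is illustrative; the real script may check more paths.
# Reports every missing persistent directory under a data root and
# returns non-zero if any of them is absent.
check_persistent_paths() (
  root="$1"
  missing=0
  for sub in baseline runtime artifacts users; do
    if [ ! -d "$root/$sub" ]; then
      echo "missing persistent path: $root/$sub" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Tooling availability can be checked with standard commands, for example:
#   command -v docker >/dev/null
#   docker compose version >/dev/null
```

Reporting all missing paths in one pass, instead of stopping at the first, gives a `dry-run` a complete picture of the host.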

### `scripts/deploy_aws.sh`

This script remains the AWS stack deployment contract. It handles the Docker
Compose update path, smoke/readiness behavior, Ollama readiness, and low-disk
rebuild hardening used by the code-only flow.

## Persistent State Boundary

The deploy is intentionally code-only.

It must not replace:

- `.env.aws` on the EC2 host;
- `/opt/ai-decision-studio/data/baseline`;
- `/opt/ai-decision-studio/data/runtime`;
- `/opt/ai-decision-studio/data/artifacts`;
- `/opt/ai-decision-studio/data/users`;
- `/opt/ai-decision-studio/secrets`;
- private baseline archives;
- Nextcloud, Ollama, Caddy, and PPT Creator Docker volumes.

Those paths are operational state. The release bundle updates product source,
safe deploy files, docs, frontend assets, and container build inputs.
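
One way to enforce this boundary is to scan the bundle's file listing before it is applied. The sketch below is illustrative: the function name and the deny patterns are assumptions, not the repository's actual rules.

```bash
# Sketch only: the patterns below are illustrative, not the repository's actual deny list.
# Reads archive member paths on stdin (for example from `tar -tzf bundle.tar.gz`)
# and fails if any path looks like a real env file, runtime data, or secrets.
bundle_listing_is_safe() (
  bad=0
  while IFS= read -r path; do
    path="${path#./}"   # tar listings often prefix members with "./"
    case "$path" in
      *.example) ;;   # sanitized example files are allowed
      .env|.env.*|*/.env|*/.env.*) bad=1 ;;
      data/baseline/*|data/runtime/*|data/artifacts/*|data/users/*) bad=1 ;;
      secrets/*|*/secrets/*) bad=1 ;;
    esac
    if [ "$bad" -eq 1 ]; then
      echo "forbidden bundle entry: $path" >&2
      break
    fi
  done
  exit "$bad"
)
```

For example, `tar -tzf bundle.tar.gz | bundle_listing_is_safe` would stop a deploy whose archive contains `.env.aws` while still allowing `.env.aws.example`.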

## Low-Disk Rebuild Hardening

The EC2 host can have limited free disk during rebuilds. The current deploy path
therefore includes low-disk safeguards:

- dry-run disk checks are advisory so the host can still report its state;
- unused Docker data can be pruned before a live build;
- deployment staging is cleaned after use;
- bundle validation avoids sending runtime payloads into the app archive;
- Ollama readiness handling no longer collides with a stale temporary readiness file left by an earlier deploy;
- the rebuild path focuses on the current five-service product stack.

The goal is not to hide disk pressure. The goal is to keep deployment possible
while making disk pressure visible in logs and preserving persistent state.
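
The advisory behavior can be sketched as below. The helper names and thresholds are hypothetical, and the sketch assumes that `execute` mode treats low disk as blocking until space is reclaimed (for example with `docker system prune`); the document itself only states that the dry-run check is advisory.

```bash
# Sketch only: helper names and thresholds are assumptions, not repository values.

# Free space in KiB on the filesystem holding a path (POSIX df output, 4th column).
free_kib() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# In dry-run, low disk is only a warning; in execute, it is reported as a failure
# so the caller can prune unused Docker data before building.
disk_preflight() (
  mode="$1"
  free="$2"
  min="$3"
  if [ "$free" -ge "$min" ]; then
    echo "disk ok: ${free} KiB free"
    exit 0
  fi
  echo "low disk: ${free} KiB free, ${min} KiB wanted" >&2
  [ "$mode" = "dry-run" ]   # exit status: success in dry-run, failure otherwise
)

# Example (not run here): disk_preflight dry-run "$(free_kib /var/lib/docker)" 5242880
```

Either way the warning goes to stderr, so disk pressure stays visible in the deploy logs.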

## Security Boundary

AWS deployment uses several boundaries:

- GitHub OIDC instead of a long-lived AWS access key in the repository;
- protected GitHub environment settings for AWS production;
- GitHub secrets for the EC2 instance identifier and temporary SSH material;
- SSM port forwarding for the SSH path used by Actions;
- private EC2 env files and credential stores outside Git;
- deployment bundles that reject real env files and runtime data.

The public application entry remains the frontend through the AWS ingress layer.
Product API, Nextcloud, Ollama, and PPT Creator remain behind the Docker network
boundary.

## Verification

Useful verification points:

```bash
bash -n scripts/deploy_aws_code_only.sh
bash -n scripts/deploy_aws_code_only_ssm.sh
bash -n scripts/deploy_aws_code_only_ssm_remote.sh
bash -n scripts/deploy_aws.sh
bash -n scripts/smoke_aws.sh
```

For deployment:

- run AWS CD in `dry-run` to validate host preflight and bundle safety;
- run AWS CD in `execute` to apply a code-only rebuild;
- inspect Actions logs for the deploy commit and SSM tunnel readiness;
- inspect smoke/readiness output after the stack restarts.

## Primary References

- `.github/workflows/deploy-aws.yml`
- `.github/workflows/product-ci.yml`
- `scripts/deploy_aws_code_only.sh`
- `scripts/deploy_aws_code_only_ssm.sh`
- `scripts/deploy_aws_code_only_ssm_remote.sh`
- `scripts/deploy_aws.sh`
- `scripts/smoke_aws.sh`
- `scripts/build_deployment_bundle.sh`
- `docs/deployment/aws-deploy.md`
- `docs/deployment/deployment-evolution.md`
- `docs/operations/ci-cd-and-release-controls.md`