Merged
98 changes: 98 additions & 0 deletions .github/workflows/cd.yml
@@ -8,6 +8,7 @@ on:

permissions:
  contents: read
  packages: write

jobs:
  deploy:
@@ -53,3 +54,100 @@ jobs:
            infra/ansible/playbook.yml \
            -e "repo_url=https://github.com/${{ github.repository }}.git" \
            -e "genai_env_file=/tmp/genai.env"

  # ------------------------------------------------------------------
  # Build & push all service images to GitHub Container Registry.
  # Kubernetes pulls these (unlike the VM, which builds locally), so this
  # runs before the Helm deploy.
  # ------------------------------------------------------------------
  docker-push:
    name: docker-push (ghcr)
    runs-on: ubuntu-latest
    timeout-minutes: 30
    strategy:
      fail-fast: false
      matrix:
        include:
          - service: organization-service
            context: services/spring-organization
          - service: member-service
            context: services/spring-member
          - service: event-service
            context: services/spring-event
          - service: feedback-service
            context: services/spring-feedback
          - service: finance-service
            context: services/spring-finance
          - service: letter-service
            context: services/spring-letter
          - service: py-genai-helper
            context: services/py-genai-helper
          - service: web-client
            context: web-client
          - service: api-docs
            context: api
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Compute lowercase image name
        id: img
        run: echo "repo=ghcr.io/${GITHUB_REPOSITORY,,}/${{ matrix.service }}" >> "$GITHUB_OUTPUT"

      - name: Build & push
        uses: docker/build-push-action@v6
        with:
          context: ${{ matrix.context }}
          push: true
          tags: |
            ${{ steps.img.outputs.repo }}:${{ github.sha }}
            ${{ steps.img.outputs.repo }}:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # ------------------------------------------------------------------
  # Deploy to the RKE2 Kubernetes cluster via Helm.
  # ------------------------------------------------------------------
  deploy-k8s:
    name: deploy (Kubernetes/Helm)
    runs-on: ubuntu-latest
    timeout-minutes: 20
    needs: docker-push
    env:
      NAMESPACE: ge83mom-devops26
    steps:
      - uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v4

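      - name: Write kubeconfig
        env:
          # Pass the secret through the environment rather than expanding it
          # inline, so quotes or `$` in the kubeconfig cannot alter the script.
          KUBECONFIG_CONTENT: ${{ secrets.KUBECONFIG }}
        run: |
          mkdir -p ~/.kube
          printf '%s' "$KUBECONFIG_CONTENT" > ~/.kube/config
          chmod 600 ~/.kube/config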

      - name: Create/refresh genai-env secret
        env:
          GENAI_ENV_CONTENT: ${{ secrets.GENAI_ENV_CONTENT }}
        run: |
          printf '%s\n' "$GENAI_ENV_CONTENT" > /tmp/genai.env
          kubectl -n "$NAMESPACE" create secret generic genai-env \
            --from-env-file=/tmp/genai.env \
            --dry-run=client -o yaml | kubectl apply -f -

      - name: Helm upgrade
        run: |
          helm upgrade --install team-devoops infra/helm/team-devoops \
            --namespace "$NAMESPACE" \
            --set global.image.tag=${{ github.sha }} \
            --wait --timeout 5m
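
Docker image references must be lowercase, which is presumably why the `docker-push` job lowercases `GITHUB_REPOSITORY` before building the tag. The workflow does this with the Bash expansion `${GITHUB_REPOSITORY,,}`; the sketch below shows the same transformation with `tr`, so it also runs in a plain POSIX shell. The `image_repo` helper and the mixed-case repository value are hypothetical and only illustrate the step.

```shell
#!/bin/sh
# Hypothetical helper mirroring the "Compute lowercase image name" step.
# $1 = owner/repo in any case, $2 = service name from the build matrix.
image_repo() {
  # Lowercase the repository path; the service name is used as given.
  lower_repo=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'ghcr.io/%s/%s\n' "$lower_repo" "$2"
}

# Assumed mixed-case value, for illustration only.
image_repo "AET-DevOps26/Team-DevOops" "member-service"
# prints: ghcr.io/aet-devops26/team-devoops/member-service
```

Only the repository path is lowercased here, matching the workflow, where the service names in the matrix are already lowercase.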
33 changes: 33 additions & 0 deletions .github/workflows/ci.yml
@@ -432,6 +432,38 @@ jobs:
        with:
          file_glob: 'api/openapi.yaml'

  # ------------------------------------------------------------------
  # Helm chart lint + render + schema validation.
  # ------------------------------------------------------------------
  helm-validate:
    name: helm-validate
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v4

      - name: Lint chart
        run: helm lint infra/helm/team-devoops

      - name: Render templates
        run: |
          helm template team-devoops infra/helm/team-devoops \
            --set global.image.tag=ci-validate > /tmp/rendered.yaml
          test -s /tmp/rendered.yaml

      - name: Install kubeconform
        run: |
          curl -sSLo kubeconform.tar.gz \
            https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz
          tar xf kubeconform.tar.gz
          sudo mv kubeconform /usr/local/bin/

      - name: Validate rendered manifests
        run: kubeconform -strict -summary /tmp/rendered.yaml

  # ------------------------------------------------------------------
  # Aggregate gate - use this single check for branch protection.
  # ------------------------------------------------------------------
@@ -451,6 +483,7 @@ jobs:
      - docker-build
      - codeql
      - openapi-lint
      - helm-validate
    steps:
      - name: Verify all required jobs succeeded
        run: |
9 changes: 9 additions & 0 deletions .pre-commit-config.yaml
@@ -36,6 +36,15 @@ repos:
        # Helm charts under infra/; current YAML files are all
        # single-document and unaffected.
        args: [--allow-multiple-documents]
        # docker-compose.override.yml uses Compose's custom `!override`
        # tag, which strict YAML parsers (PyYAML) cannot resolve.
        # infra/helm/**/templates/* are Helm (Go-template) files, not plain
        # YAML, so PyYAML cannot parse their `{{ ... }}` directives.
        exclude: |
          (?x)^(
              infra/docker-compose\.override\.yml|
              infra/helm/.*/templates/.*
          )$
      - id: check-json
        # tsconfig*.json files use JSONC (comments allowed), which is
        # valid for TypeScript but rejected by strict JSON parsers.
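
To see which paths the new `exclude` pattern on `check-yaml` actually skips, it can help to try it against a few file names. pre-commit evaluates the pattern as a Python regular expression, and the `(?x)` flag only lets it span several lines; the sketch below assumes a single-line POSIX extended regex is equivalent for these inputs and checks it with `grep -E`. The `is_excluded` helper and the sample paths `templates/deployment.yaml` and `values.yaml` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: single-line equivalent of the verbose exclude pattern.
# Returns 0 when the path would be skipped by the check-yaml hook.
is_excluded() {
  printf '%s\n' "$1" |
    grep -Eq '^(infra/docker-compose\.override\.yml|infra/helm/.*/templates/.*)$'
}

# Sample paths for illustration; only the first appears in the repository diff.
for path in \
  infra/docker-compose.override.yml \
  infra/helm/team-devoops/templates/deployment.yaml \
  infra/helm/team-devoops/values.yaml \
  infra/docker-compose.yml
do
  if is_excluded "$path"; then
    echo "skip   $path"
  else
    echo "check  $path"
  fi
done
```

Under this pattern only files below a `templates/` directory are skipped, so a chart's `values.yaml` and `Chart.yaml` would still be checked as plain YAML.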
120 changes: 118 additions & 2 deletions README.md
@@ -29,7 +29,7 @@ repo/
│ └── generated/ # ⚠ Generated — do not edit by hand
├── web-client/ # React SPA (Vite, TypeScript)
│ └── src/api.ts # ⚠ Generated — do not edit by hand
-├── infra/ # docker-compose, Traefik config, Helm/Terraform
+├── infra/ # docker-compose, Traefik config, Terraform, Ansible, Helm
└── .github/workflows/ # CI/CD pipelines
```

@@ -52,7 +52,7 @@ The Spring Boot services and the GenAI service share a **PostgreSQL** database.
| GenAI Service | `/api/v1/helper/…` | 5000 | Python 3.12, Flask, LangChain |
| Web Client | `/` | 8080 | React, Vite |
| Swagger UI | `/docs` | 8080 | swaggerapi/swagger-ui |
-| Traefik dashboard | `http://localhost:8080` | — | Traefik v3 |
+| Traefik dashboard | `http://localhost:8080` (local only) | — | Traefik v3 |
| PostgreSQL | internal only | 5432 | postgres:15 |

## Code Generation
@@ -114,6 +114,122 @@ git push --no-verify
The full hook configuration lives in [`.pre-commit-config.yaml`](.pre-commit-config.yaml)
and the helper scripts under [`scripts/hooks/`](scripts/hooks/).

## Running Locally

Spin up the full stack on your machine with Docker Compose:

```bash
cd infra
docker compose up -d --build
```

This auto-merges [`infra/docker-compose.override.yml`](infra/docker-compose.override.yml),
which strips TLS / Let's Encrypt / Host-based routing from the base file so
everything is reachable on plain HTTP:

| URL | Service |
|---|---|
| <http://localhost/> | Web client |
| <http://localhost/docs> | Swagger UI |
| <http://localhost/api/v1/&lt;service&gt;/…> | APIs (organization, members, events, feedback, finance, letters, helper) |
| <http://localhost:8080> | Traefik dashboard |

> **Do not run** `docker compose -f infra/docker-compose.yml up` locally — that
> skips the override, causing Traefik to request a real Let's Encrypt cert for
> the production hostname from your laptop. Failed challenges count toward the
> production rate limit.

Tear down:
```bash
cd infra && docker compose down # keeps the postgres volume
cd infra && docker compose down -v # wipes the postgres volume too
```

## Production Deployment

The stack runs on a single Azure VM in **UAE North**, fronted by Traefik with a
real TLS certificate from Let's Encrypt (production CA). Everything is
automated; no manual VM access is required for normal deploys.

**Live URL:** <https://team-devoops.uaenorth.cloudapp.azure.com>

### Infrastructure stack

| Layer | Tool | What it does |
|---|---|---|
| Provisioning | **Terraform** (AzureRM ~> 4.0) | Resource group, VNet, NSG (22/80/443), static public IP + free Azure FQDN, Ubuntu 24.04 VM |
| Configuration | **Ansible** | Installs Docker, clones repo, writes `.env`, runs `docker compose up` |
| CI/CD | **GitHub Actions** (OIDC, no client secrets) | `infra.yml` (manual: plan/apply/destroy) and `cd.yml` (auto on push to `main`) |
| Remote state | **Azure Blob Storage** (`stteamdevoopstfstate/tfstate`) | Shared, locked Terraform state — survives between CI runs |
| TLS | **Let's Encrypt** (HTTP-01 via Traefik) | Cert persisted in a Docker volume; auto-renewed |

### GitHub Actions workflows

- **`infra` workflow** — manual (`workflow_dispatch`). Choose `plan`, `apply`, or `destroy`.
- **`cd` workflow** — runs automatically on every push to `main` (and is also `workflow_dispatch`-able). Deploys the current `main` to the VM via Ansible **and** to the Kubernetes cluster via Helm (see [Kubernetes deployment](#kubernetes-deployment-helm)).

### Required GitHub secrets / variables

| Kind | Name | Purpose |
|---|---|---|
| Variable | `AZURE_CLIENT_ID` | OIDC app registration (Service Principal) |
| Variable | `AZURE_TENANT_ID` | Azure AD tenant |
| Variable | `AZURE_SUBSCRIPTION_ID` | Target subscription |
| Secret | `VM_SSH_PUBLIC_KEY` | Public key planted on the VM by Terraform |
| Secret | `SSH_PRIVATE_KEY` | Matching private key for Ansible to SSH in |
| Secret | `VM_HOST` | Host Ansible connects to — use the FQDN above |
| Secret | `GENAI_ENV_CONTENT` | Contents of `services/py-genai-helper/.env` |
| Secret | `KUBECONFIG` | Kubeconfig for the RKE2 cluster (used by the `deploy-k8s` job) |

The OIDC service principal needs `Contributor` on the subscription (to manage
resources in `rg-team-devoops`) and `Storage Blob Data Contributor` on the
state account `stteamdevoopstfstate` (to read/write tfstate).

### Typical workflow

1. **Change infra** → push to `main` (or any branch), trigger `infra` workflow with `apply`.
2. **Change app code** → merge to `main`; `cd` runs automatically and redeploys.
3. **Tear down** → trigger `infra` workflow with `destroy`.

### Running Terraform locally

```bash
az login
az account set --subscription <AZURE_SUBSCRIPTION_ID>
export ARM_SUBSCRIPTION_ID=<AZURE_SUBSCRIPTION_ID>
export ARM_USE_AZUREAD=true
cd infra/terraform
echo "admin_ssh_public_key = \"$(cat ~/.ssh/team-devoops-azure.pub)\"" > terraform.tfvars
terraform init
terraform plan
```

Local and CI share the same remote state, so do not run `apply` in both at the
same time (the backend's blob lease will block one, but coordinate anyway).

### Kubernetes deployment (Helm)

Besides the Azure VM, the stack is deployed to a **Kubernetes
cluster** (TUM RKE2) via a Helm umbrella chart. Both deploys run in parallel on
push to `main`; the VM path is unchanged.

| Aspect | Value |
|---|---|
| Chart | [`infra/helm/team-devoops`](infra/helm/team-devoops) |
| Namespace | `ge83mom-devops26` |
| Host | <https://ge83mom-devops26.stud.k8s.aet.cit.tum.de> |
| Ingress | cluster `nginx` ingress (path-prefix routing, prefix stripped per service) |
| Images | built and pushed to `ghcr.io/aet-devops26/team-devoops/<service>` |
| Database | in-cluster PostgreSQL `StatefulSet` + PVC (cluster default StorageClass) |

The `cd` workflow's `docker-push` job builds every service image and pushes it to
GHCR, tagged with both the commit SHA and `latest`; `deploy-k8s` then runs
`helm upgrade --install` against the cluster, pinning `global.image.tag` to that
SHA. On pull requests, the `ci` workflow's `helm-validate` job lints the chart,
renders it, and schema-validates the output with `kubeconform`.

See [`infra/helm/README.md`](infra/helm/README.md) for the chart layout, required
one-time secrets (`genai-env`, `ghcr-pull`), and manual deploy instructions.

## Docs

- [Problem Statement](docs/problem-statement.md)
9 changes: 9 additions & 0 deletions api/Dockerfile
@@ -0,0 +1,9 @@
# Bakes the OpenAPI spec into the Swagger UI image so it can be pulled like any
# other service (no runtime volume/ConfigMap needed).
FROM swaggerapi/swagger-ui:latest

# Served by swagger-ui from this path; SWAGGER_JSON points at it.
COPY openapi.yaml /app/openapi.yaml

ENV SWAGGER_JSON=/app/openapi.yaml
ENV BASE_URL=/docs
Empty file removed infra/.gitkeep
Empty file.