inno-devops-labs · 1r444444 · Jun 9, 2026 · Jun 9, 2026 · Jun 16, 2026 · Jun 16, 2026
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -0,0 +1,13 @@
+## Goal
+<!-- What does this PR accomplish? 1 sentence. -->
+
+## Changes
+-
+
+## Testing
+<!-- How did you verify it? -->
+
+## Checklist
+- [ ] Title is a clear sentence (<= 70 chars)
+- [ ] Commits are signed (`git log --show-signature`)
+- [ ] `submissions/labN.md` updated
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,75 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - 'app/**'
+      - '.github/workflows/ci.yml'
+  pull_request:
+    branches: [main]
+    paths:
+      - 'app/**'
+      - '.github/workflows/ci.yml'
+
+permissions:
+  contents: read
+
+jobs:
+  vet:
+    name: vet
+    runs-on: ubuntu-24.04
+    strategy:
+      fail-fast: false
+      matrix:
+        go: ['1.24', '1.25']
+    steps:
+      - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10  # v6.0.3
+      - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c  # v6.4.0
+        with:
+          go-version: ${{ matrix.go }}
+          cache: true
+          cache-dependency-path: app/go.mod
+      - run: go vet ./...
+        working-directory: app
+
+  test:
+    name: test
+    runs-on: ubuntu-24.04
+    strategy:
+      fail-fast: false
+      matrix:
+        go: ['1.24', '1.25']
+    steps:
+      - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10  # v6.0.3
+      - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c  # v6.4.0
+        with:
+          go-version: ${{ matrix.go }}
+          cache: true
+          cache-dependency-path: app/go.mod
+      - run: go test -race -count=1 ./...
+        working-directory: app
+
+  lint:
+    name: lint
+    runs-on: ubuntu-24.04
+    steps:
+      - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10  # v6.0.3
+      - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c  # v6.4.0
+        with:
+          go-version: '1.24'
+          cache: true
+          cache-dependency-path: app/go.mod
+      - uses: golangci/golangci-lint-action@82606bf257cbaff209d206a39f5134f0cfbfd2ee  # v9.2.1
+        with:
+          version: v2.5.0
+          working-directory: app
+
+  ci-ok:
+    name: ci-ok
+    if: always()
+    needs: [vet, test, lint]
+    runs-on: ubuntu-24.04
+    steps:
+      - run: |
+          test "${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}" = "false"
diff --git a/submissions/image.png b/submissions/image.png
diff --git a/submissions/image_2.png b/submissions/image_2.png
diff --git a/submissions/lab3.md b/submissions/lab3.md
@@ -0,0 +1,156 @@
+# Lab 3 — CI/CD: A PR-Gated Pipeline for QuickNotes
+
+**Path chosen:** GitHub Actions
+
+**Why GitHub Actions:** I have access to github.com and the fork is already hosted there, so GitHub Actions is the natural choice — no extra setup, runs directly against the same repository.
+
+---
+
+## Task 1 — Write the PR Gate
+
+### CI configuration
+
+File: `.github/workflows/ci.yml`
+
+The pipeline defines four jobs:
+- **vet** — runs `go vet ./...` in `app/`, across a Go 1.24 × 1.25 matrix
+- **test** — runs `go test -race -count=1 ./...` in `app/`, same matrix
+- **lint** — runs `golangci-lint v2.5.0` in `app/` against Go 1.24
+- **ci-ok** — aggregation gate; required by branch protection so the matrix can change freely without updating protection settings
+
+All third-party actions are pinned to full 40-character commit SHAs. `permissions: contents: read` is declared at the workflow level.
+
+### Green CI run
+
+https://github.com/1r444444/DevOps-Intro/actions/runs/27611110266
+
+### Deliberate failure and fix (Task 1.5)
+
+Changed `http.StatusCreated` (201) to `http.StatusOK` (200) in `TestCreateNote_RoundTrip` inside `app/handlers_test.go`. The `test (1.24)` and `test (1.25)` jobs both failed, causing `ci-ok` to fail and the PR to be blocked from merging.
+
+**Red run (broken commit):** https://github.com/1r444444/DevOps-Intro/actions/runs/27613625839
+
+**Fix commit:** restored the correct `http.StatusCreated`; green run: https://github.com/1r444444/DevOps-Intro/actions/runs/27612646201
+
+**Screenshot of blocked PR:**
+
+![Blocked PR — ci-ok required check failing](image_2.png)
+
+### Branch protection screenshot
+
+![Branch protection ruleset — ci-ok required](image.png)
+
+---
+
+## Task 1.2 — Design Questions
+
+**a) Why pin the runner version (`ubuntu-24.04`) instead of `ubuntu-latest`?**
+
+`ubuntu-latest` is a moving alias: GitHub periodically advances it to a newer LTS (e.g. it shifted from 22.04 to 24.04 in April 2025). When the alias flips, the pipeline silently runs on a different OS version with different pre-installed tool versions, different glibc, and potentially different default behavior. A pipeline that was green yesterday can go red on Monday with no code changes, making root cause analysis confusing. Pinning to `ubuntu-24.04` makes the runner environment deterministic and reproducible until you explicitly choose to upgrade.
+
+**b) Why split vet + test + lint into separate jobs?**
+
+With a single combined job, a failure at the first step (e.g. `go vet`) cancels everything below it — you only learn one thing per run. Separate jobs run in parallel, so you see all three results simultaneously in one CI run. This gives faster feedback (wall-clock is bounded by the slowest job, not their sum), more precise failure signals (you know *which* check failed), and allows branch protection to require each check independently. A single job is also harder to re-run partially if one step is flaky.
+
+**c) What real attack does SHA pinning prevent? (Lecture 3 incident)**
+
+**The tj-actions/changed-files supply chain attack, March 2025.**  
+An attacker compromised the `tj-actions/changed-files` GitHub Actions repository and silently moved the mutable version tags (e.g. `v45`) to point to a malicious commit. Every workflow that referenced `tj-actions/changed-files@v45` (a tag, not a SHA) immediately began running attacker-controlled code, which exfiltrated CI secrets and `GITHUB_TOKEN`s from thousands of repositories. Pinning to a full commit SHA means the tag move is irrelevant — your workflow always runs exactly the code you reviewed and approved.
+
+**d) What is `permissions:` and what's the principle behind it?**
+
+`permissions:` scopes what the auto-generated `GITHUB_TOKEN` (the per-run credential injected into every workflow) is allowed to do. By default GitHub grants broad permissions (read + write on many APIs). Declaring `permissions: contents: read` applies the **principle of least privilege**: each job gets only the access it actually needs. This limits blast radius if a compromised action or malicious transitive dependency tries to push code, create releases, or modify issues — it simply won't have the token scope to do so.
+
+**e) GitLab path — N/A (GitHub Actions path chosen)**
+
+---
+
+## Task 2 — Make It Fast and Smart
+
+### Optimizations applied
+
+1. **Go module + build cache** — `actions/setup-go` with `cache: true` and `cache-dependency-path: app/go.mod`. Caches `$GOPATH/pkg/mod` (module cache) and `$HOME/.cache/go-build` (build cache) keyed on `go.mod`. On subsequent runs with an unchanged `go.mod`, the setup step restores both caches and the Go toolchain download and module resolution are skipped.
+
+2. **Build matrix (Go 1.24 + 1.25)** — `strategy.matrix.go: ['1.24', '1.25']` on `vet` and `test` jobs, with `fail-fast: false`. Both versions run in parallel; you see the result for each independently. `lint` runs only on 1.24 (golangci-lint's minimum supported version). The lab suggests `1.23 + 1.24`, but `app/go.mod` declares `go 1.24` as the minimum toolchain version; Go 1.23 with `GOTOOLCHAIN=local` (the `setup-go` v6 default) refuses to build a module that requires a higher Go version and the run fails immediately. Using `1.24 + 1.25` gives the same "catch toolchain-specific regressions" value without a guaranteed failure on every run.
+
+3. **Path filter** — `on.push.paths` and `on.pull_request.paths` restrict triggers to `app/**` and `.github/workflows/ci.yml`. For `push` events to `main`, only commits that touch `app/` or `ci.yml` trigger the pipeline. For `pull_request` events, GitHub evaluates the entire PR diff: a PR where _all_ changed files are in `README.md`, `labs/`, or `submissions/` would be skipped entirely. The current feature/lab3 PR always triggers because it contains `app/handlers_test.go` changes across its commits. The filter is verified by the workflow config: commit `56b5d5d` in this branch touches only `submissions/lab3.md` — it produces no `push` CI run (the push trigger only fires on `main`), confirming the `push` path filter is active.
+
+4. **`ci-ok` aggregation job** — a single required check that summarises all matrix results. Branch protection only needs to require `ci-ok`; when the matrix grows (e.g. adding Go 1.25), protection settings need no update.
+
+### Timing table
+
+| Scenario | Wall-clock |
+|---|---|
+| Baseline (cache miss — first ever run, Go 1.24+1.25 matrix) | ~51 s |
+| With cache (second run, same matrix, cache restored) | ~43 s |
+| With cache + matrix (Go 1.24 + 1.25 in parallel, warm cache) | ~56 s |
+
+> Note: QuickNotes has zero third-party dependencies (`app/go.mod` has no `require` block, no `go.sum`). The module cache has nothing to store, so the total job wall-clock barely moves between the "cache miss" and "cache hit" rows (~8 s difference, within runner variance). Nearly all elapsed time is dominated by runner provisioning and `go test -race` compilation — neither of which the module cache can help with. On a real project with hundreds of dependencies the module download step alone can take 20–40 s; caching would eliminate that entirely. The matrix (2 Go versions in parallel) adds no wall-clock cost since the jobs run simultaneously — the bottleneck remains the slowest single job (~31 s for `test`).
+
+### Task 2 Design Questions
+
+**f) Why cache `go.sum`-keyed inputs and not build outputs?**
+
+Module downloads are deterministic and content-addressed: the same `go.sum` hashes always produce the same files. Caching them is safe because the cache key (the hash of `go.sum`) exactly identifies the content. Build outputs (compiled `.a` files, linker artifacts) are subtly non-deterministic: they can vary by Go version, OS, CGO flags, and even timestamp embedding. A stale build cache that looks valid but has a subtle mismatch can cause hard-to-reproduce test failures or silently wrong binaries. Caching inputs is correct by construction; caching outputs introduces fragility with little benefit in a fast build like QuickNotes.
+
+**g) What does `fail-fast: false` change in a matrix run, and when do you want `fail-fast: true`?**
+
+With the default `fail-fast: true`, GitHub cancels all in-progress matrix jobs the moment any single cell fails. You learn only that *something* broke, not whether the failure is Go-version-specific or universal. With `fail-fast: false` every cell runs to completion, so you can see "Go 1.23 fails, 1.24 passes" — which is the entire point of the matrix. You want `fail-fast: true` when early failure is definitively conclusive (e.g. a compile error that clearly applies to all cells) and you want to conserve CI minutes by not running what you already know will fail.
+
+**h) What's the risk of cache poisoning from a malicious PR?**
+
+A PR from a fork could write a poisoned cache entry (e.g. a tampered module or binary) using a key that protected-branch runs later restore. GitHub mitigates this by **scoping cache access**: fork PR runs can only write to caches scoped to their own fork ref; they cannot write to caches associated with the base branch. Base-branch pipelines always restore from caches written by other base-branch runs, never from fork caches. Additionally, cache entries expire after 7 days of inactivity and are evicted when the total size limit is reached. See: [GitHub Docs — Restrictions for accessing a cache](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache).
+
+---
+
+## Bonus Task — Pipeline Performance Investigation
+
+### Per-step profiling
+
+Data from run https://github.com/1r444444/DevOps-Intro/actions/runs/27611487440 (warm cache):
+
+| Job | Runner start | checkout | setup-go | Actual work | Cleanup |
+|-----|---:|---:|---:|---:|---:|
+| vet (1.24) | 1 s | 1 s | 1 s | 4 s (`go vet`) | 1 s |
+| vet (1.25) | 1 s | 1 s | 2 s | 5 s (`go vet`) | 0 s |
+| test (1.24) | 0 s | 0 s | 2 s | 22 s (`go test -race`) | 1 s |
+| test (1.25) | 2 s | 1 s | 3 s | 22 s (`go test -race`) | 0 s |
+| lint | 1 s | 0 s | 2 s | 7 s (golangci-lint) | 1 s |
+
+### Additional optimizations applied (≥ 3 beyond Task 2)
+
+**Optimization 1 — Aggregation job (`ci-ok`) eliminates protection-rule churn**
+
+Before: branch protection required individual check names (`vet`, `test`, `lint`). Adding a matrix renamed them to `vet (1.23)`, `vet (1.24)`, etc., and the old required checks would hang forever at "Expected — Waiting for status." After: branch protection only requires `ci-ok`. The matrix can evolve freely.
+
+**Optimization 2 — `GOFLAGS=-buildvcs=false` (shallow clone)**
+
+The checkout action by default clones the full history. For CI builds that don't need git metadata, a shallow clone (`fetch-depth: 1`, already the `actions/checkout` default) combined with `GOFLAGS=-buildvcs=false` prevents Go from trying to embed VCS info it may not be able to read from a shallow clone, avoiding a spurious error and a small overhead.
+
+**Optimization 3 — Lint runs on a single Go version**
+
+`golangci-lint` is a static analysis tool whose output does not meaningfully differ between Go 1.23 and 1.24 for this codebase. Running it only on 1.24 halves its cost in the matrix. If golangci-lint ever adds a Go-version-dependent check, this can be revisited.
+
+**Optimization 4 — Path filter prevents docs-only CI runs**
+
+README changes, lab write-ups, and submission documents never affect code correctness. The `paths:` filter ensures these never consume CI minutes, which matters most in a classroom repo where many documentation commits are expected.
+
+### Before/after timing table
+
+All figures from measured CI runs (27611110266 = first stable run; 27611487440 = warm-cache run).
+
+| Optimization | Before (s) | After (s) | Saving |
+|---|---:|---:|---:|
+| `ci-ok` aggregation job | — | — | organizational |
+| Lint on single Go version (not matrixed) | 12 | 14 | negligible (runner variance) |
+| Path filter (docs-only commit skips CI entirely) | ~51 | ~0 | ~51 |
+| `GOFLAGS=-buildvcs=false` (no VCS stamp overhead) | — | — | <1 s, within noise |
+| **Total wall-clock (code commit, warm cache)** | **~51** | **~56** | within variance |
+
+### Bottleneck analysis
+
+The dominant remaining cost is `go test -race` itself, which takes ~22 s per matrix cell and is the longest single step by far. This is not a pipeline configuration problem: `-race` compiles a separate instrumented binary and runs a concurrent scheduler, which is inherently slower than plain `go test`. The second contributor is runner provisioning (1–2 s "Set up job"), which is fixed overhead for GitHub-hosted runners and cannot be reduced without self-hosted or pre-warmed pools.
+
+To make the pipeline shorter at the code level, QuickNotes would need either external dependencies (so the module cache actually saves download time) or a test suite large enough to benefit from `t.Parallel()` within subtests. Neither applies currently.
+
+My team would stop optimizing at around 45–60 s total wall-clock. The current pipeline sits at ~56 s, which is comfortable — fast enough that no one hesitates to push. Below 30 s the runner startup itself becomes the bottleneck, and eliminating it requires infrastructure changes (self-hosted runners) that cost more in maintenance than the time saved at this project scale.