173 changes: 173 additions & 0 deletions plugins/kbagent/skills/kbagent-cicd-migration/SKILL.md
@@ -0,0 +1,173 @@
---
name: kbagent-cicd-migration
description: >
Use when migrating an existing kbc (keboola-as-code) GitHub CI/CD pipeline to
the new kbagent (keboola-agent-cli) sync engine. Covers: converting per-project
pull/push PR workflows, multi-project repos (e.g. L0/L1 dev->prod promotion),
branch->environment mapping, GitHub secrets/variables/environments setup, the
install step (uv tool install instead of downloading a Go binary), and the
kbc->kbagent command/flag/env-var mapping. Triggers: migrate CI/CD, migrate
pipeline, kbc to kbagent, port GitHub Actions, CLI-based-sync-demo, kbc pull
push CI, project-as-code CI migration, replace kbc binary in CI, gitops
migration, multi-project promotion, dev to prod Keboola, KBC_STORAGE_API_TOKEN
to KBC_TOKEN, sync push CI, sync pull CI.
---

# kbc -> kbagent CI/CD Migration

Guides a customer through porting a `kbc` GitHub CI/CD pipeline (the
[CLI-based-sync-demo](https://github.com/keboola/CLI-based-sync-demo) shape:
per-project pull/push, multi-project promotion, branch-gated deploys) to the new
`kbagent sync` engine, emitting **clean kbagent-native workflows**.

## Reality check — this is a one-time BREAKING migration, not a command swap

Do **not** tell the user they can just swap `kbc` for `kbagent` in place. Three
hard incompatibilities make this a deliberate cutover (verified against the code):

1. **The on-disk layout/format is different and incompatible.**
- `kbc` writes per config: `config.json` + `meta.json` + `description.md` (JSON).
- `kbagent` writes per config: **`_config.yml`** (YAML, with `name`/`description`/
`parameters` hoisted + a `_configuration_extra` block) + extracted code files
(`constants.py:425`, `sync/config_format.py`).
- The first `kbagent sync pull` therefore **rewrites every configuration** into a
new format. The old `config.json`/`meta.json` files are **not read** by kbagent
and become orphans that must be deleted. Expect a **massive reformatting diff**.
2. **kbagent sync is an ORCHESTRATOR, not cwd-per-folder.** `kbc pull` runs against
whatever directory you `cd` into. `kbagent sync pull` *requires* `--project ALIAS`
(resolved from a central config store) or `--all-projects` (`sync.py:67,495`). In
CI we bridge this with env-injection: `KBAGENT_PROJECT_FROM_ENV=1` synthesizes a
project under the reserved alias `__env__`, and every command passes
`--project __env__ --directory <folder>`.
3. **The two tools cannot co-own the same tree.** Because the source-of-truth files
differ, you cannot have `kbc push` and `kbagent push` both treating one directory
as canonical. You must cut over.
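
The env-injection bridge from point 2 can be sketched as a small helper that builds the argument list and environment for one `kbagent sync` call. This is a minimal sketch, not the generator's actual code: the function name `kbagent_sync_invocation` and its parameters are hypothetical, while the flags, the `__env__` alias, and the three environment variables are the ones named in this document.

```python
import os

RESERVED_ALIAS = "__env__"  # reserved alias described in point 2 above


def kbagent_sync_invocation(subcommand, folder, token, api_url, extra_flags=()):
    """Return (argv, env) for one env-injected `kbagent sync` call.

    Hypothetical helper: only the flags and variable names come from the text.
    """
    argv = [
        "kbagent", "sync", subcommand,
        "--project", RESERVED_ALIAS,
        "--directory", folder,
        *extra_flags,
    ]
    env = dict(os.environ)  # copy, so the caller's environment is not mutated
    env.update({
        "KBAGENT_PROJECT_FROM_ENV": "1",
        "KBC_TOKEN": token,
        "KBC_STORAGE_API_URL": api_url,
    })
    return argv, env


argv, env = kbagent_sync_invocation(
    "pull", "L0", "<token>", "https://connection.keboola.com", ["--force"]
)
print(" ".join(argv))
```

The pair can be handed to `subprocess.run(argv, env=env, check=True)`. Keeping the token in `env` rather than in `argv` keeps it out of process listings and echoed command lines.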

### Consequence
- The migration is a **one-time conversion commit** on a **dedicated branch**, where
the JSON tree is replaced by the YAML tree. Reviewing that diff line-by-line is
impractical; you verify by **behavior** (`sync diff` clean, dry-run push empty), not
by reading the reformat. Merging it to `main` is effectively a tooling version bump.
- Until cutover, keep the legacy `kbc` workflows; afterwards delete them in the same PR.

> **Live-verified (project 153, GCP europe-west3, 2026-06):**
> - `KBAGENT_PROJECT_FROM_ENV=1` + `--project __env__` works for `init` and `pull`.
> - kbc pulled **143 `config.json` + 187 `meta.json`** (JSON); kbagent pulled
> **147 `_config.yml`** (YAML). Zero overlap in file format.
> - `sync init --adopt-existing` on a **kbc-produced tree** succeeds, but the very
> next `sync diff` reports **`0 to create, 0 to update, 136 to delete`** — kbagent
> does not read kbc's `config.json` at all, so it sees every existing config as a
> local deletion. **A `sync push --allow-delete` here would delete all 136 remote
> configs.** Adopt-existing adopts only the manifest, NOT the configs.
> - The correct path — adopt → `sync pull --force` (writes `_config.yml`) → `git rm`
> the orphaned `config.json`/`meta.json` — converged the diff from 136 deletes to
> ~2 (plus ~9 remote-only scheduler/variables configs that pull/diff treat
> inconsistently — verify these per project). Both file sets coexist after pull
> until you delete the kbc files, so the `git rm` step is mandatory, not optional.
>
> **Operator rule:** never run `sync push` against an adopted-but-not-yet-pulled kbc
> tree. Always `sync pull --force` first, confirm `sync diff` is clean, THEN enable
> the push lane.
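
The operator rule can be enforced mechanically before the push lane runs. The sketch below assumes the `sync diff` summary line has the shape quoted above (`0 to create, 0 to update, 136 to delete`); the function names are hypothetical, and if your `kbagent` version offers `sync diff --json`, parsing that output is likely more robust than matching text.

```python
import re

# ASSUMPTION: the summary line looks like the one quoted in the live-verified notes.
SUMMARY_RE = re.compile(r"(\d+) to create, (\d+) to update, (\d+) to delete")


def parse_diff_summary(text):
    """Extract the create/update/delete counts from `sync diff` output."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no diff summary line found")
    create, update, delete = (int(group) for group in match.groups())
    return {"create": create, "update": update, "delete": delete}


def push_is_safe(text, max_deletes=0):
    """Refuse to push when the diff plans more deletions than allowed."""
    return parse_diff_summary(text)["delete"] <= max_deletes


sample = "0 to create, 0 to update, 136 to delete"
print(parse_diff_summary(sample))
print(push_is_safe(sample))
```

Defaulting `max_deletes` to zero makes the gate fail closed: an adopted-but-not-yet-pulled tree, which shows every config as a deletion, blocks the push.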

### What still carries over unchanged
The *orchestration shell* is CLI-agnostic: manual/scheduled pull that commits state,
PR validation, GitHub-Environment-gated push, branch→env mapping, per-project loops.
Only three mechanics change: **install** (`uv tool install` not a binary download),
**commands/flags** (`kbagent sync ...`, see
[references/command-mapping.md](references/command-mapping.md)), and **auth env vars**
(`KBAGENT_PROJECT_FROM_ENV=1` + `KBC_TOKEN` + `KBC_STORAGE_API_URL`).

## Workflow

### Step 1 — Analyze the existing repo (always start here)
Run the engine in dry-run mode to inventory projects and the legacy CI it replaces:

```bash
python <skill_dir>/scripts/migrate_cicd.py /path/to/repo
```

It prints every project found (one per `.keboola/manifest.json`), each project's
id / stack host / required token secret name, any `ignoredComponents` ("subset of
a project" — see Step 5), and the legacy `kbc` workflow/action files it supersedes.
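
To make the inventory step concrete, here is a minimal sketch of project discovery. It is not the code of `migrate_cicd.py`; the function `find_projects` is hypothetical, and the `project.id` / `project.apiHost` keys are an assumption about the kbc manifest layout that you should check against your own `.keboola/manifest.json`.

```python
import json
import tempfile
from pathlib import Path


def find_projects(repo_root):
    """Return one record per `.keboola/manifest.json` found under repo_root."""
    root = Path(repo_root)
    records = []
    for manifest_path in sorted(root.rglob(".keboola/manifest.json")):
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        project = manifest.get("project", {})  # ASSUMPTION: kbc-style "project" object
        records.append({
            "directory": manifest_path.parent.parent.relative_to(root).as_posix(),
            "project_id": project.get("id"),
            "api_host": project.get("apiHost"),
            "ignored_components": manifest.get("ignoredComponents", []),
        })
    return records


# Self-check against a throwaway tree holding two fake projects.
with tempfile.TemporaryDirectory() as tmp:
    for name, pid in (("L0", 101), ("L1", 102)):
        keboola_dir = Path(tmp, name, ".keboola")
        keboola_dir.mkdir(parents=True)
        (keboola_dir / "manifest.json").write_text(json.dumps({
            "project": {"id": pid, "apiHost": "connection.keboola.com"},
            "ignoredComponents": ["keboola.foo"] if name == "L1" else [],
        }), encoding="utf-8")
    projects = find_projects(tmp)

for record in projects:
    print(record)
```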

### Step 2 — Pick a version pin (decide before generating)
- **Pinned (recommended for prod lanes):** `--version 0.58.0` (PyPI, once published)
or `--git-ref v0.58.0` (git tag, until PyPI exists). Reproducible CI.
- **Unpinned (`keboola-agent-cli`, resolves to latest):** only acceptable for a
non-prod/scratch lane. Warn the user: unpinned + the current auto-update behavior
means non-deterministic CI runs.
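
The three pinning choices map to three `uv tool install` targets. The helper below is a hypothetical sketch (the name `uv_install_command` is not part of the generator); the package name and git URL are the ones given in `references/command-mapping.md`.

```python
def uv_install_command(version=None, git_ref=None):
    """Build the `uv tool install` argv for a pinned or unpinned kbagent."""
    if version and git_ref:
        raise ValueError("pass either version or git_ref, not both")
    if version:
        spec = f"keboola-agent-cli=={version}"  # PyPI pin
    elif git_ref:
        spec = f"git+https://github.com/keboola/cli@{git_ref}"  # git tag pin
    else:
        spec = "keboola-agent-cli"  # unpinned: resolves to latest
    return ["uv", "tool", "install", spec]


print(uv_install_command(version="0.58.0"))
print(uv_install_command(git_ref="v0.58.0"))
print(uv_install_command())
```

Rejecting the combination of both pins avoids silently preferring one source over the other in a prod lane.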

### Step 3 — Generate the clean workflows
```bash
python <skill_dir>/scripts/migrate_cicd.py /path/to/repo --write \
--version 0.58.0 --main-branch main --schedule "0 * * * *"
```
Produces:
- `.github/workflows/kbagent-validate.yml` — on PR: `sync diff` + `sync push --dry-run` per project (read-only drift + secret-encryption preflight).
- `.github/workflows/kbagent-pull.yml` — manual + optional cron: `sync pull --force` per project, commits state back.
- `.github/workflows/kbagent-push.yml` — manual, **GitHub-Environment-gated**: `sync push` per project, with an `allow_delete` input.

The legacy `kbc` files are **left in place** — review the new ones, then delete the
old workflows/actions in the same PR.

### Step 3b — Perform the one-time config conversion (dedicated branch)
This is the breaking part. On a fresh migration branch, for each project, convert
the JSON tree to kbagent's YAML tree and remove the orphaned kbc files:

```bash
git checkout -b migrate/kbc-to-kbagent
# Per project (do ONE non-prod project first and verify):
KBAGENT_PROJECT_FROM_ENV=1 KBC_TOKEN="$L0_TOKEN" KBC_STORAGE_API_URL=https://connection.keboola.com \
  kbagent sync init --adopt-existing --project __env__ --directory L0
# --force makes the pull write _config.yml for every config (see the operator rule above):
KBAGENT_PROJECT_FROM_ENV=1 KBC_TOKEN="$L0_TOKEN" KBC_STORAGE_API_URL=https://connection.keboola.com \
  kbagent sync pull --force --project __env__ --directory L0
# Remove the orphaned kbc JSON files from both the index and the working tree.
# (With `git rm --cached` they would stay on disk and `git add -A` would re-add them.)
find L0 \( -name config.json -o -name meta.json \) -exec git rm --quiet --ignore-unmatch {} +
git add -A
```
Verify by **behavior**, not by reading the reformat diff: a follow-up
`sync diff --project __env__ -d L0` must be clean and `sync push --dry-run` empty.
Only then repeat for the remaining projects and the production lane.
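
A cheap pre-check before running `sync diff` is to confirm that no kbc files survived the conversion. The sketch below is illustrative only: `conversion_report` is a hypothetical helper, and it treats just `config.json` and `meta.json` as orphans, matching the file names this document lists.

```python
import tempfile
from pathlib import Path

KBC_ORPHAN_NAMES = {"config.json", "meta.json"}  # kbc files that kbagent does not read


def conversion_report(project_dir):
    """List leftover kbc files and count kbagent `_config.yml` files."""
    root = Path(project_dir)
    orphans = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.name in KBC_ORPHAN_NAMES
    )
    converted = sum(1 for p in root.rglob("_config.yml") if p.is_file())
    return {"orphans": orphans, "converted": converted}


# Self-check: a tree where both file sets coexist, then the same tree cleaned up.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp, "main", "extractor", "demo")
    config_dir.mkdir(parents=True)
    for name in ("config.json", "meta.json", "_config.yml"):
        (config_dir / name).write_text("", encoding="utf-8")
    before = conversion_report(tmp)
    (config_dir / "config.json").unlink()
    (config_dir / "meta.json").unlink()
    after = conversion_report(tmp)

print(before)
print(after)
```

An empty `orphans` list only shows that the kbc files are gone; the behavioral checks (`sync diff` clean, dry-run push empty) remain the real acceptance test.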

### Step 4 — Set up GitHub secrets, variables, environments
The engine prints exact `gh` commands. The model: **one Storage API token secret
per project** (`KBC_TOKEN_<ALIAS>`), and two **Environments** (`prod`, `dev`) so
prod pushes require approval. See [references/secrets-setup.md](references/secrets-setup.md)
for the full mapping from the old `secrets.KBC_SAPI_TOKEN_*` / `vars.KBC_*` scheme.
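
The per-project secret naming can be sketched as follows. The `KBC_TOKEN_<ALIAS>` pattern is from this document; the sanitization rule (uppercase, non-alphanumeric runs collapsed to `_`) and the function name `token_secret_name` are assumptions made so that any alias yields a valid GitHub secret name. Prefer the names the engine actually prints.

```python
import re


def token_secret_name(alias):
    """Derive the GitHub secret name for a project alias.

    ASSUMPTION: aliases are upper-cased and runs of non-alphanumerics become "_".
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "_", alias).strip("_").upper()
    if not slug:
        raise ValueError(f"alias {alias!r} has no usable characters")
    return f"KBC_TOKEN_{slug}"


for alias in ("L0", "l1-prod"):
    print(alias, "->", token_secret_name(alias))
```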

### Step 5 — Confirm scope ("subset of a project") and branching
- **Subset:** if a project should only sync part of its config tree, set
`ignoredComponents` (and/or `allowedBranches`) in that project's
`.keboola/manifest.json` — `kbagent sync` honors both, exactly like `kbc`.
- **Branching:** the old model used a fixed branch id per env. The new model maps
git branch → Keboola dev branch via `.keboola/branch-mapping.json` +
`kbagent sync branch-link`. For PR-based promotion this is usually *better*:
a PR branch links to a Keboola dev branch, `main` pushes to production. Add
`--git-branching` to annotate, and walk the user through `branch-link` if they
want per-PR isolated dev branches. If they want to keep the simple single-branch
(production) model, leave branch-mapping at the default (null = production).

### Step 6 — Validate before merging
- Open the migration PR; the `kbagent-validate` workflow runs `sync diff` — confirm
the diff is empty (no unintended drift) against each project.
- Manually run `kbagent-pull` once after the conversion commit and confirm that it
commits nothing new, or only a handful of files. Compare against the converted YAML
tree, not against what `kbc pull` produced: the layouts differ, so that diff is the
large one-time reformat described in the reality check above.
- Do a `kbagent-push` dry-run (the validate workflow already does this) and read
the planned changes before the first real gated push.

## Guardrails (state these to the user)
- **Never** add `--allow-plaintext-on-encrypt-failure` to CI push — it silently
uploads `#`-secrets in cleartext if the Encryption API is down. The generated
push is fail-closed by design.
- `sync push --allow-delete` deletes remote configs removed locally. It is wired to
the `allow_delete` workflow input (default off). Treat it like the old `--force`.
- Tokens live **only** in GitHub secrets and are injected as env vars per step; the
generated workflows never write a `config.json` to disk.

## Reference material
- [references/migration-runbook.md](references/migration-runbook.md) — **the ordered PR sequence / cutover plan** (pre-flight → conversion PR → start-over). Use this when the user asks "which PRs, what order, how do I cut over."
- [references/branching-model.md](references/branching-model.md) — **how to choose** single-branch (Model A) vs git-branching (Model B), with a decision table.
- [references/command-mapping.md](references/command-mapping.md) — kbc ↔ kbagent commands, flags, env vars.
- [references/secrets-setup.md](references/secrets-setup.md) — secrets/vars/environments migration table + `gh` setup.
- `scripts/migrate_cicd.py` — the analyzer + generator (stdlib only).
@@ -0,0 +1,67 @@
# Choosing a branching model

kbagent supports two ways to map your git repo to Keboola branches. Pick one up
front — it changes how `sync init` is run and what `push` touches.

## The two models

### Model A — Single-branch / production-direct
- Each project directory maps to **one Keboola project's production branch**.
- `.keboola/branch-mapping.json` stays at its default (`null` = production).
- `sync pull` / `sync push` read and write that project's production branch directly.
- Promotion across environments (dev → prod project) is a **git merge** between
project directories/repos, then a `push` to the next project (the demo's L0→L1
shape).

**Choose A when:**
- You're experimenting, or driving a single dev project locally with kbagent instead of kbc.
- Your "environments" are *separate Keboola projects* (L0/L1), not dev branches.
- You want the simplest mental model and fewest moving parts.

**Trade-off:** a `push` writes straight to the production branch of that project —
there is no isolated staging copy inside Keboola. Review happens in git, not in KBC.

### Model B — Git-branching (Keboola dev-branch isolation)
- `sync init --git-branching` creates `.keboola/branch-mapping.json`.
- Each **git branch** links to a **Keboola development branch** (an isolated server-
side copy) via `kbagent sync branch-link --branch-name <git-branch>`.
- Work on a PR branch → `push` lands in its Keboola dev branch (safe, isolated);
merge to `main` → `push` lands in production.
- `kbagent sync branch-status` shows the mapping; `branch-unlink` detaches.

**Choose B when:**
- Multiple people open PRs against the same project and you want each change tested
in isolation inside Keboola before it hits production.
- You already use Keboola's development-branches feature.
- You want PRs to never write production directly.

**Trade-off:** more lifecycle to manage (create/link/unlink dev branches, clean them
up), and the mapping file is per-clone state.

## Decision shortcut

| Your situation | Model |
|---|---|
| "I just want to manipulate one project with kbagent instead of kbc" | **A** (start here) |
| Separate dev/prod **projects** promoted by git merge (L0/L1) | **A** |
| PR-per-change, multiple contributors, want isolated server-side testing | **B** |
| You rely on Keboola development branches today | **B** |

You can start on **A** and adopt **B** later: run `sync init --git-branching` and
`branch-link` when you actually need per-PR isolation. Moving A→B is additive (it adds
a mapping file); it does not require re-converting the config tree.
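
The difference between the two models comes down to how a git branch resolves to a Keboola target. The sketch below assumes a flat `{git branch: Keboola branch id or null}` mapping, which is a hypothetical schema chosen for illustration; the real `.keboola/branch-mapping.json` layout may differ. Only the `null` = production rule is taken from this document.

```python
import json


def resolve_target(mapping_json, git_branch):
    """Resolve a git branch to its Keboola target under the ASSUMED flat schema."""
    mapping = json.loads(mapping_json)
    if git_branch not in mapping:
        return "unlinked"  # hypothetical label for a branch with no mapping entry
    branch_id = mapping[git_branch]
    return "production" if branch_id is None else f"dev-branch:{branch_id}"


sample_mapping = '{"main": null, "feature/x": 4521}'  # made-up ids
for branch in ("main", "feature/x", "feature/y"):
    print(branch, "->", resolve_target(sample_mapping, branch))
```

Under Model A the mapping never gains a non-null entry, so every push resolves to production.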

## How the model shows up in commands

```bash
# Model A (production-direct) — nothing special:
kbagent sync pull --project <alias> -d <dir>
kbagent sync push --project <alias> -d <dir>

# Model B (git-branching):
kbagent sync init --git-branching --project <alias> -d <dir>
git checkout -b feature/x
kbagent sync branch-link --project <alias> -d <dir> --branch-name feature/x
kbagent sync pull --project <alias> -d <dir>   # now targets the linked dev branch
kbagent sync push --project <alias> -d <dir>
kbagent sync branch-status --project <alias> -d <dir>
```
@@ -0,0 +1,59 @@
# kbc ↔ kbagent command / flag / env mapping

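Authoritative mapping used by the migration generator. Verify flags against your
installed `kbagent` version (`kbagent sync pull --help`); the new CLI evolves fast.
Every `kbagent sync` command below also needs `--project ALIAS` (in CI:
`--project __env__`) or `--all-projects`; the tables omit it for brevity.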

## Install

| kbc (old) | kbagent (new) |
|---|---|
| Download Go binary zip from `keboola/keboola-as-code` GitHub release, unzip to `/usr/local/bin/kbc` | `uv tool install keboola-agent-cli==<ver>` (PyPI) or `uv tool install 'git+https://github.com/keboola/cli@<tag>'` |
| `kbc --version` | `kbagent version` |
| Custom `install` composite action | `astral-sh/setup-uv@v5` + one `uv tool install` line |

## Core sync commands

| kbc (old) | kbagent (new) | Notes |
|---|---|---|
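| `kbc init -d DIR --allow-target-env` | `kbagent sync init --directory DIR [--adopt-existing]` | `--adopt-existing` reuses an existing `.keboola/manifest.json` written by `kbc`; it adopts the manifest only, not the configs (confirm the flag name with `kbagent sync init --help`) |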
| `kbc persist -d DIR` | *(folded into `sync pull`)* | No separate persist step; pull writes manifest + new objects |
| `kbc pull -d DIR --force` | `kbagent sync pull --directory DIR --force` | `--force` overrides local-vs-remote conflicts (3-way diff) |
| `kbc push -d DIR` | `kbagent sync push --directory DIR` | Encrypts `#`-secrets fail-closed before write |
| `kbc push -d DIR --force` | `kbagent sync push --directory DIR --allow-delete` | `--allow-delete` removes remote configs deleted locally |
| `kbc push --dry-run` / push-dry action | `kbagent sync push --dry-run --directory DIR` | Shows planned changes without writing |
| `kbc diff -d DIR` | `kbagent sync diff --directory DIR [--json]` | `--json` gives structured drift for CI gating |
| `kbc status` | `kbagent sync status --directory DIR` | |
| `kbc validate` (JSON-schema) | *(no direct equivalent — gap)* | Use `sync diff` for drift; schema validation is not ported |

## Auth / environment variables

| kbc (old) | kbagent (new) | Notes |
|---|---|---|
| `KBC_STORAGE_API_TOKEN` | `KBC_TOKEN` | Storage API token |
| `KBC_STORAGE_API_HOST` (bare host) | `KBC_STORAGE_API_URL` (full URL) | `connection.keboola.com` → `https://connection.keboola.com` |
| *(implicit)* | `KBAGENT_PROJECT_FROM_ENV=1` | **Required** opt-in so kbagent synthesizes an ephemeral project from the env in CI (no `config.json` on disk). See `constants.py:163`, `config_store.py:193` |
| `KBC_PROJECT_ID`, `KBC_BRANCH_ID`, `KBC_BRANCHES` | *(from manifest + branch-mapping)* | Project id comes from `.keboola/manifest.json`; branch from `.keboola/branch-mapping.json` |
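
The auth rows above amount to a small environment translation. The function below is a hypothetical sketch (`translate_env` is not part of either CLI): it renames the token variable, turns the bare host into a full URL, and adds the required opt-in flag.

```python
def translate_env(kbc_env):
    """Map a kbc-style environment dict to the kbagent variables in the table."""
    out = {"KBAGENT_PROJECT_FROM_ENV": "1"}  # required opt-in for env-injected projects
    token = kbc_env.get("KBC_STORAGE_API_TOKEN")
    if token:
        out["KBC_TOKEN"] = token
    host = kbc_env.get("KBC_STORAGE_API_HOST")
    if host:
        # kbc takes a bare host; kbagent expects a full URL.
        has_scheme = host.startswith(("http://", "https://"))
        out["KBC_STORAGE_API_URL"] = host if has_scheme else f"https://{host}"
    return out


legacy = {
    "KBC_STORAGE_API_TOKEN": "<token>",
    "KBC_STORAGE_API_HOST": "connection.keboola.com",
    "KBC_BRANCH_ID": "123",  # dropped: the branch now comes from branch-mapping.json
}
print(translate_env(legacy))
```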

## Branching

| kbc (old) | kbagent (new) |
|---|---|
| Fixed `KBC_BRANCH_ID` per env; `allowedBranches` in manifest | `.keboola/branch-mapping.json` (git branch → Keboola branch id; `null` = production) managed by `kbagent sync branch-link / branch-unlink / branch-status` |
| Branch dir under repo (`main/`) | Same on-disk layout; mapping decides which Keboola branch a git branch targets |

## Subset of a project

Both CLIs honor manifest-level scoping — no command change needed:

- `allowedBranches: ["<id>"]` — restrict which branches sync.
- `ignoredComponents: ["keboola.foo", ...]` — exclude component types.

`kbagent` parses both (`sync/manifest.py:120`). Additionally, `sync pull` flags
`--skip-storage` / `--skip-jobs` / `--with-table-samples` control how much
*metadata* (beyond configs) is pulled — orthogonal to the config subset.

## What has NO clean port (call out to the user)
- `kbc validate` JSON-schema validation.
- `kbc ci workflows` generator itself (this skill replaces it).
- Templates / dbt / CI-scaffold subsystems (`kbc template`, `kbc dbt`) — keep `kbc`
for those; they are out of scope for sync CI/CD.