Skip to content

feat(encrypt): --json plaintext_written field across secret-write paths (GHSA-7jrf follow-up, code-only)#465

Open
padak wants to merge 4 commits into
mainfrom
fix/plaintext-written-structured
Open

feat(encrypt): --json plaintext_written field across secret-write paths (GHSA-7jrf follow-up, code-only)#465
padak wants to merge 4 commits into
mainfrom
fix/plaintext-written-structured

Conversation

@padak

@padak padak commented Jun 23, 2026

Copy link
Copy Markdown
Member

Summary

Completes the M7 follow-up (private advisory GHSA-7jrf-xc86-8wf6). The merged #463 made the plaintext-on-encrypt-failure warning loud on stderr and named the leaked keys. This adds the structured plaintext_written field to every secret-write result so --json consumers (AI agents, scripts) see the leak in the envelope, not only on stderr.

What it does

plaintext_written is a list of the secret key-paths still in plaintext after a write (key-paths only, never the values; [] when encryption succeeded). The detection is gate-free: find_plaintext_secret_keys(<final_config>) after the encrypt step returns exactly the leaked keys ([] on success; it raises before returning when fallback isn't allowed).

  • config_service: update_config, create_config_row, create_config, update_config_row — added to the real-write result dict (dry-run paths return earlier and intentionally keep plaintext for a readable diff, so they're untouched — no false positives).
  • variables_service: set_variables — threaded out through the encrypt helpers (_build_encrypted_row_configuration now returns (config, leaked_keys); the private _create/_update_linked_variables return tuples widened accordingly — verified no external callers).
  • data_app_service: secrets-set — surfaces the existing problems set (keys whose ciphertext check failed).
  • Commands (config update/new/row-create/row-update, config variables-set, data-app secrets-set): a human-mode formatter.warning naming the keys + remediation. --json already carries the field via the result envelope.

Also: the config no-project_id plaintext branch now names the leaked keys too — so no plaintext-write path stays silent.

Tests

test_config_encryption.py (TestPlaintextWrittenField, 7 tests) + test_variables_service.py (TestSetVariablesEncryption, 3 tests): assert plaintext_written == [] on successful encryption and lists the exact leaked key-paths after a fallback (e.g. #parameters.#password, #values.[0].#api_token), and explicitly assert the secret value never appears in the field. Full suite green: 4192 passed, 135 skipped; lint/format/ty clean.

⚠️ Code-only PR

No version bump / changelog — added at the next release alongside #422/#460/#461/#462/#463.

Audit

This completes the broad M7 follow-up. With it, the entire 2026-06-12 security audit is fully closed (10 fixed + 2 owner-accepted residuals).


Open in Devin Review

…_written field (GHSA-7jrf follow-up)

The #463 central fix made the plaintext-on-encrypt-failure warning loud on
stderr and naming the leaked keys. This adds the structured `plaintext_written`
field (list of leaked secret key-PATHS, never values; [] when encryption
succeeded) to the result of every secret-write path so --json consumers (AI
agents, scripts) see the leak in the envelope, not just on stderr:

- config_service: update_config / create_config_row / create_config /
  update_config_row (find_plaintext_secret_keys on the real-write config;
  dry-run paths return earlier and are untouched).
- variables_service: set_variables (threaded through the encrypt helpers).
- data_app_service: secrets-set (the existing `problems` failed-cipher keys).
- Commands surface it as a human-mode formatter.warning naming the keys +
  remediation; --json already carries the field via the result envelope.

Also: the config no-project_id plaintext branch now names the leaked keys too,
so NO plaintext-write path stays silent.

Code-only (no version bump / changelog) per the conflict-immunity rationale;
version + changelog at next release.

Private advisory GHSA-7jrf-xc86-8wf6.

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

Open in Devin Review

Comment thread src/keboola_agent_cli/services/variables_service.py Outdated

@padak padak left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #465 — feat(encrypt): --json plaintext_written field across secret-write paths

Generated by kbagent-pr-reviewer subagent. Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

The PR completes the M7 follow-up to GHSA-7jrf: it adds a plaintext_written field (list of leaked key-paths, never values) to the --json result envelope of every secret-write command (config update/new/row-create/row-update, config variables-set, data-app secrets-set), so --json consumers see a plaintext-on-encrypt-failure fallback structurally rather than only on stderr. The overall approach is sound and the service tests are thorough. Three findings: one NON-BLOCKING rule deviation (new tuple return on _build_encrypted_row_configuration) and two NON-BLOCKING documentation gaps (human-mode warning not covered by a CliRunner test; gotchas.md not updated with the new plaintext_written field). No security issues, no layer violations, no OPERATION_REGISTRY gaps (no new commands), no backward-compat breaks.

Verdict: APPROVE — all findings are NON-BLOCKING.

Verdict

  • Verdict: APPROVE
  • Blocking findings: 0
  • Non-blocking findings: 3
  • Nits: 1

Blocking findings

(none)

Non-blocking findings

[NB-1] src/keboola_agent_cli/services/variables_service.py:392_build_encrypted_row_configuration changed from a single-value return to a new tuple return

_build_encrypted_row_configuration previously returned dict[str, Any]. This PR changes it to tuple[dict[str, Any], list[str]]. CONTRIBUTING.md (Code Quality Patterns > "Return values") states: "existing tuple[...] returns in services are grandfathered, but do not add new ones." The method was not a tuple-returning method before — this is a new tuple return, not a widening of an existing one.

The PR correctly notes that the two callers _create_linked_variables and _update_linked_variables are already tuple-returning methods (and are just widened, which falls under the grandfathering rule). However _build_encrypted_row_configuration itself was not. The fix is to extract find_plaintext_secret_keys(row_config) at the call sites instead of inside the helper, keeping _build_encrypted_row_configuration returning dict[str, Any].

[NB-2] tests/test_config_encryption.py — no CliRunner test for human-mode Warning: output from _emit_plaintext_written_warning

The PR adds _emit_plaintext_written_warning(formatter, result) to five locations in commands/config.py and one in commands/_data_app_runtime.py. All CLI-layer tests in TestCliFlagWiring use --json mode and a MagicMock service — they verify the flag is forwarded, not that the warning text actually appears on stderr in human mode. There is no CliRunner test that runs without --json, stubs the service to return {"plaintext_written": ["#parameters.#password"]}, and asserts "were written in PLAINTEXT" appears in the output. Per CONTRIBUTING.md, CLI-layer tests must test both JSON output and human output. The human-mode path is the safety net that warns operators who run kbagent interactively.

[NB-3] plugins/kbagent/skills/kbagent/references/gotchas.md — new plaintext_written JSON field not documented

The PR introduces a new field in the --json result envelope of config update, config new --push, config row-create, config row-update, config variables-set, and data-app secrets-set. AI agents and scripts that parse --json output can use plaintext_written to detect a plaintext-fallback leak programmatically. This is non-obvious behavior (a new key that is always present, [] on success) that belongs in gotchas.md per CONTRIBUTING.md ("if the command's behavior is non-obvious, add an entry tagged with (since vX.Y.Z)").

The PR is code-only with no version bump. The (since vX.Y.Z) tag should be added at the release that includes this PR (alongside #422/#460/#461/#462/#463). The existing config create/update + rows auto-encrypt entry at line 34 of gotchas.md would be a natural place to append a bullet: "plaintext_written: [] is always present in the --json result — empty on success, or a list of leaked key-paths when --allow-plaintext-on-encrypt-failure allowed a fallback."

Nits

  • [NIT-1] src/keboola_agent_cli/services/variables_service.py:269 — the docstring for _create_linked_variables says "The 4th tuple element is the plaintext-fallback leak". Prefer phrasing that matches the field name used elsewhere: "The 4th tuple element is plaintext_written — leaked key-paths ([] when encryption succeeded)." Symmetry with _update_linked_variables and _build_encrypted_row_configuration docstrings aids future readers.

Verification log

  • gh auth status → authenticated as padak (github.com), scopes include repo
  • gh pr view 465 --json state,additions,deletions,files → state=OPEN, +303/-17, 7 files, conventional feat(encrypt): prefix ✓
  • git rev-parse --abbrev-ref HEADfix/plaintext-written-structured (matches <branch>) ✓
  • Layer violation grep (typer/click in services, httpx in commands) → empty ✓
  • formatter.warning() guards json_mode internally (output.py:118) → no double-emit in JSON mode ✓
  • _emit_plaintext_written_warning called only inside else: (human-mode) blocks in config.py → ✓; formatter.warning() itself provides a second guard ✓
  • No new @*_app.command(...) decorators in the diff → no plugin synchronization map entries needed ✓
  • permissions.py OPERATION_REGISTRY check → no new commands, no entry needed ✓
  • Magic number grep → empty ✓; bare except: grep → empty ✓; print() in prod grep → empty ✓; raw error_code strings grep → empty ✓
  • _build_encrypted_row_configuration on main returned dict[str, Any] (confirmed via git show main:...); PR changes it to tuple[dict[str, Any], list[str]] → new tuple return (NB-1)
  • All three private methods (_create_linked_variables, _update_linked_variables, _build_encrypted_row_configuration) have only intra-class callers → no external API break, but CONTRIBUTING.md rule still applies
  • data_app_service.py:1366 plaintext_written: sorted(problems)problems correctly tracks keys whose encrypted ciphertext did not match ENCRYPTED_PASSWORD_PREFIXES and were written anyway; semantically accurate ✓
  • Searched for human-mode warning string ("were written in PLAINTEXT") in tests/ → no CliRunner test found (NB-2)
  • gotchas.md grep for plaintext_written → no entry found (NB-3)
  • make check4192 passed, 8 skipped, 127 deselected, 15 warnings in 85.55s, exit 0 ✓
  • Runtime reproduction: skipped — no --allow-plaintext-on-encrypt-failure scenario can be safely triggered against a live project without intentionally breaking the Encryption API. The service-level tests are comprehensive and sufficient.

Open questions for the author

(none)

padak added 3 commits June 23, 2026 23:13
Review feedback on the GHSA-7jrf follow-up:

- NB1 (BINDING CONTRIBUTING.md #0): _build_encrypted_row_configuration was
  a clean dict-return; the field PR turned it into tuple[dict, list[str]],
  introducing the bare-tuple anti-pattern to a previously-clean leaf helper.
  Keep it returning the row config and let the three call sites derive the
  leak via find_plaintext_secret_keys(row_config). The already-tuple callers
  (_create_linked_variables / _update_linked_variables) are unchanged.

- NB2 (test gap): add a CliRunner test locking the human-mode stderr warning
  ("were written in PLAINTEXT") paired with the --json plaintext_written
  field -- asserts the leaked key-PATH is named and the secret VALUE is not.

make check green: 4193 passed.
…t_written

NIT-1 from review: name the leak element `plaintext_written` in the
_create_linked_variables / _update_linked_variables docstrings, matching the
field name surfaced in the service result and the wording on
_build_encrypted_row_configuration. Docstring-only.
…lasses

Devin review: CONTRIBUTING.md mandates "multi-value returns use a @DataClass,
never a bare tuple beyond two values". The plaintext_written PR widened two
helpers past that line -- _update_linked_variables crossed a compliant 2-tuple
to a 3-tuple, and _create_linked_variables a 3-tuple to a 4-tuple -- so the
"fix it in the PR you are touching" rule (CONTRIBUTING.md #0) applies.

Introduce frozen _CreatedLinkedVariables / _UpdatedLinkedVariables and
destructure via named fields at the set_variables call sites. Behavior
unchanged; 4193 tests pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant