Merged
2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
@@ -10,7 +10,7 @@
"plugins": [
{
"name": "kbagent",
"version": "0.57.0",
"version": "0.58.0",
"source": "./plugins/kbagent",
"description": "AI-friendly interface to Keboola Connection projects — explore configs, jobs, lineage, call MCP tools, manage dev branches, and debug SQL in workspaces",
"category": "development"
2 changes: 1 addition & 1 deletion plugins/kbagent/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "kbagent",
"version": "0.57.0",
"version": "0.58.0",
"description": "AI-friendly interface to Keboola Connection projects — explore configs, jobs, lineage, call MCP tools, manage dev branches, and debug SQL in workspaces",
"author": {
"name": "Keboola",
Expand Down
Original file line number Diff line number Diff line change
@@ -164,12 +164,12 @@ Bucket sharing + linking across projects in the same organization. `sharing edge

## Workspaces (SQL Debugging)
- `workspace create --project ALIAS [--name NAME] [--ui] [--read-only]` -- create workspace (headless ~1s, `--ui` ~15s). Since v0.47.1: Snowflake headless workspaces return a `private_key` PEM field; `password` is empty. BigQuery workspaces keep the default password credential shape.
- `workspace list [--project NAME ...] [--orphaned] [--branch ID] [--qs-compatible]` -- list workspaces. `--project` repeatable; `--orphaned` filters to workspaces whose backing `keboola.sandboxes` config is missing. **Since v0.42.0 (#304)**: each entry carries `login_type`, `read_only`, `qs_compatible`, `database`, `warehouse`. New `Login Type` / `RO` / `QS` columns in human mode. `--qs-compatible` pre-filters to RO + whitelisted-loginType workspaces (the canonical data-app shape). `--branch` requires exactly one `--project`; without `--branch`, the command behaves like `storage buckets` and uses production with an `Info: Using production branch for read (active dev branch X ignored; pass --branch X to override)` banner when an alias is pinned to a dev branch
- `workspace detail --project ALIAS --workspace-id ID [--branch ID]` -- show connection details. **Since v0.42.0 (#304)**: response carries `login_type`, `read_only`, `qs_compatible`; human mode adds `Login type:` / `Read-only:` / `Query Service compatible:` rows. `--branch` opt-in mirrors `workspace list`
- `workspace list [--project NAME ...] [--orphaned] [--branch ID] [--qs-compatible]` -- list workspaces. `--project` repeatable; `--orphaned` filters to workspaces whose backing `keboola.sandboxes` config is missing. **Since v0.42.0 (#304)**: each entry carries `login_type`, `read_only`, `qs_compatible`, `database`, `warehouse`. New `Login Type` / `RO` / `QS` columns in human mode. `--qs-compatible` pre-filters to RO + whitelisted-loginType workspaces (the canonical data-app shape). **Updated v0.58.0**: `qs_compatible` is keyed by `(backend, loginType)` -- BigQuery workspaces (loginType `default`) now report `qs_compatible: true` and pass `--qs-compatible`; pre-0.58.0 every BigQuery workspace was wrongly excluded (Snowflake's own legacy `default` stays `false`). `--branch` requires exactly one `--project`; without `--branch`, the command behaves like `storage buckets` and uses production with an `Info: Using production branch for read (active dev branch X ignored; pass --branch X to override)` banner when an alias is pinned to a dev branch
- `workspace detail --project ALIAS --workspace-id ID [--branch ID]` -- show connection details. **Since v0.42.0 (#304)**: response carries `login_type`, `read_only`, `qs_compatible`; human mode adds `Login type:` / `Read-only:` / `Query Service compatible:` rows. **Updated v0.58.0**: BigQuery `default` workspaces now report `qs_compatible: true` (was `false`). `--branch` opt-in mirrors `workspace list`
- `workspace delete --project ALIAS --workspace-id ID` -- delete workspace
- `workspace password --project ALIAS --workspace-id ID` -- reset and return new password
- `workspace load --project ALIAS --workspace-id ID --tables TABLE_ID [...] [--preserve]` -- load storage tables
- `workspace query --project ALIAS --workspace-id ID --sql "..." [--file F] [--transactional]` -- run SQL via Query Service
- `workspace query --project ALIAS --workspace-id ID --sql "..." [--file F] [--transactional]` -- run SQL via Query Service. **Backend-agnostic since v0.58.0**: runs against both Snowflake and BigQuery workspaces (the path was always identical; BigQuery just needed the classification fix). Mind the dialect: Snowflake quotes identifiers with `"..."`, BigQuery with backticks `` `...` ``
- `workspace gc [--project NAME ...] [--dry-run] [--yes]` -- garbage-collect orphaned workspaces (and any lingering `keboola.sandboxes` configs). `--dry-run` previews without deleting; `--project` repeatable, omit to GC across all connected projects
- `workspace from-transformation --project ALIAS --component-id ID --config-id ID [--row-id ID]` -- workspace from existing transform

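The dialect note on `workspace query` can be illustrated with a small quoting helper. This is a minimal sketch, not part of kbagent: `quote_identifier` and `qualified_name` are hypothetical names, and the sample identifiers are the ones used in this reference. It assumes identifiers are already in the form the backend expects (for BigQuery, the dataset name already normalized to underscores).

```python
def quote_identifier(name: str, backend: str) -> str:
    """Quote one SQL identifier for a workspace backend (hypothetical helper)."""
    if backend.lower() == "bigquery":
        # BigQuery quotes identifiers with backticks. Names containing a
        # backtick are rejected here rather than escaped (a simplification).
        if "`" in name:
            raise ValueError(f"backtick not supported in identifier: {name!r}")
        return f"`{name}`"
    # Snowflake quotes identifiers with double quotes; an embedded double
    # quote is escaped by doubling it.
    return '"' + name.replace('"', '""') + '"'


def qualified_name(parts: list[str], backend: str) -> str:
    """Join individually quoted parts with dots (hypothetical helper)."""
    return ".".join(quote_identifier(part, backend) for part in parts)


print(qualified_name(["sapi_901", "in.c-main", "users"], "snowflake"))
# "sapi_901"."in.c-main"."users"
print(qualified_name(["in_c_main", "users"], "bigquery"))
# `in_c_main`.`users`
```

In practice, reading the pre-quoted `tables[].sql_path` value from `bucket-detail` avoids hand-quoting altogether.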
17 changes: 15 additions & 2 deletions plugins/kbagent/skills/kbagent/references/gotchas.md
@@ -260,17 +260,30 @@ plus a derived `qs_compatible: bool`.
}
```

**Compatibility whitelist (`constants.QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES`):**
**Compatibility is keyed by (backend, loginType) -- since v0.58.0.** The same
`default` string means opposite things per backend, so there are two whitelists.

Snowflake (`constants.QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES`):

- `snowflake-service-keypair` -- confirmed PASS
- `snowflake-person-sso` -- confirmed PASS
- `snowflake-person-keypair` -- confirmed PASS (since v0.47.1)
- `snowflake-legacy-service` -- explicitly OFF the list (works on
`connection.keboola.com` but FAILED on GCP us-east4 stack in the
original #304 incident -- keep it off until cross-stack confirmation)
- `default` (legacy 2016 workspaces) -- confirmed FAIL
- `default` on Snowflake (legacy 2016 workspaces) -- confirmed FAIL
(`JWT token is invalid`)

BigQuery (`constants.QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES_BIGQUERY`):

- `default` on BigQuery -- confirmed PASS (since v0.58.0). Every BigQuery
workspace carries loginType `default` (the sandbox API exposes no
Snowflake-style variants for BigQuery), and the Query Service runs SELECTs
against it -- verified live against project 9621 on `connection.keboola.com`.
Before v0.58.0, kbagent's whitelist was Snowflake-only, so BigQuery
workspaces were mislabeled `qs_compatible: false` and hidden by
`workspace list --qs-compatible`, even though `workspace query` worked.

`qs_compatible: false` does NOT mean "broken"; it means "not on the
confirmed-good whitelist". For an unknown loginType, `workspace list`
renders it as `?` (yellow) in the QS column so callers know the policy
Expand Down
Original file line number Diff line number Diff line change
@@ -149,7 +149,7 @@ workspace.
- **Snowflake**: converts unquoted identifiers to UPPERCASE. Always double-quote database, schema, and table names -- Keboola names are typically lowercase (e.g. `"sapi_901"."in.c-main"."users"`).
- **BigQuery**: requires backticks (`` ` ``), not double quotes; the dataset name is normalized to underscores (e.g. `` `in_c_main`.`users` ``).
- Easiest path: read `tables[].sql_path` from `bucket-detail` -- it is already correctly quoted for the bucket's backend (since v0.25.3).
- **Query Service**: uses Storage API token for auth -- no Snowflake credentials needed in the query command
- **Query Service**: uses Storage API token for auth -- no warehouse credentials needed in the query command. Backend-agnostic: runs SELECTs against **both Snowflake and BigQuery** workspaces (BigQuery since v0.58.0; the path was always identical, the gap was classification). BigQuery workspaces carry `login_type: "default"` and are `qs_compatible: true` from v0.58.0 -- earlier versions mislabeled them `false`.
- **Transactional mode**: add `--transactional` to wrap SQL in a transaction

## Orphan detection + garbage collection (since v0.22.0)
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "keboola-agent-cli"
version = "0.57.0"
version = "0.58.0"
description = "AI-friendly CLI for managing Keboola projects"
readme = "README.md"
requires-python = ">=3.12"
30 changes: 30 additions & 0 deletions src/keboola_agent_cli/changelog.py
@@ -24,6 +24,36 @@

# Ordered newest-first. Each value is a list of brief one-line descriptions.
CHANGELOG: dict[str, list[str]] = {
"0.58.0": [
"New: `kbagent workspace query` runs SQL against BigQuery workspaces, not just Snowflake. "
"The Query Service path was always backend-agnostic (`POST "
"/api/v1/branches/{b}/workspaces/{w}/queries` + CSV export are identical for both backends), so "
"this was a classification + error-legibility fix rather than a new execution path -- verified "
"live against project 9621 (e2e-bigquery) on connection.keboola.com, including a real-data "
"`workspace load` + `query` round-trip. Mind the dialect: Snowflake quotes identifiers with "
'`"..."`, BigQuery with backticks `` `...` ``.',
"Fix: BigQuery workspaces are no longer mislabeled `qs_compatible: false`. `qs_compatible` is now "
"keyed by (backend, loginType): BigQuery's `default` loginType is whitelisted via the new "
"`QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES_BIGQUERY`, kept separate from the Snowflake whitelist "
"because Snowflake's own legacy `default` is rejected by the Query Service ('JWT token is "
"invalid') -- the same string means compatible for BigQuery and incompatible for Snowflake. "
"Pre-0.58.0 every BigQuery workspace was wrongly hidden by `workspace list --qs-compatible` and "
"shown incompatible in `workspace detail`, even though queries ran fine.",
"Change: `workspace create` on a BigQuery project now requests loginType `default` explicitly. "
"It is the only BigQuery loginType and matches keboola-mcp-server, rather than omitting it and "
"relying on the backend default; Snowflake key-pair creation is unchanged.",
"Fix: BigQuery query errors now read as plain text instead of a serialized wrapper. The Query "
'Service returns a failed BigQuery statement as `{Location: ...; Message: "..."; Reason: ...}`; '
"the new `_unwrap_bigquery_error` (`client.py`) extracts the inner `Message` so the error box "
"matches Snowflake's plain text (Snowflake errors have no wrapper and pass through untouched). "
"Tests: `TestBigQueryQueryServiceSupport`, `TestUnwrapBigQueryError`, a BigQuery case in "
"`TestExtractQueryJobError`, and a backend-aware `test_e2e.py` workspace query.",
"New (#401): `kbagent changelog` now shows a one-line summary per version by default, with "
"`--full` / `-v` to expand every note. Entries follow an authoring contract -- one logical change "
"per prefixed bullet (`New:`/`Fix:`/`Change:`/...), leading with a self-contained first sentence "
"-- so the default view and the post-update 'What's new' banner stay scannable instead of "
"rendering a wall of text.",
],
"0.57.0": [
"BREAKING (flow / conditional flows): the `flow` command group now targets "
"conditional flows (`keboola.flow`) ONLY; `keboola.orchestrator` support is "
35 changes: 30 additions & 5 deletions src/keboola_agent_cli/client.py
@@ -9,6 +9,7 @@

import json
import logging
import re
import time
from collections.abc import Iterator
from pathlib import Path
@@ -2743,6 +2744,25 @@ def wait_for_query_job(self, query_job_id: str) -> dict[str, Any]:
)


# The Query Service surfaces BigQuery errors as a serialized object string, e.g.
# {Location: "query"; Message: "Syntax error: Unexpected identifier ..."; Reason: "invalidQuery"}
# Pull out the human-readable `Message: "..."` part so a BigQuery failure reads
# like Snowflake's plain text instead of leaking the wrapper into the user's red
# error box. Mirrors keboola-mcp-server's `_BigQueryWorkspace._format_error_message`.
_BQ_ERROR_MESSAGE_RE = re.compile(r'Message:\s*"((?:[^"\\]|\\.)*)"')


def _unwrap_bigquery_error(message: str) -> str:
"""Extract the inner message from a serialized BigQuery Query-Service error.

Snowflake errors are plain strings with no ``Message: "..."`` wrapper, so
they pass through unchanged. Only the BigQuery object shape is rewritten.
"""
if message and (match := _BQ_ERROR_MESSAGE_RE.search(message)):
return match.group(1).replace('\\"', '"')
return message


def _extract_query_job_error(job: dict[str, Any]) -> str:
"""Pull the most useful warehouse error message out of a failed Query Service job.

@@ -2772,14 +2792,19 @@ def _extract_query_job_error(job: dict[str, Any]) -> str:

def _as_text(err: Any) -> str:
if isinstance(err, str):
return err.strip()
if isinstance(err, dict):
raw = err.strip()
elif isinstance(err, dict):
raw = ""
for key in ("message", "error", "detail"):
val = err.get(key)
if isinstance(val, str) and val.strip():
return val.strip()
return ""
return str(err).strip() if err is not None else ""
raw = val.strip()
break
else:
raw = str(err).strip() if err is not None else ""
# BigQuery wraps the real message in a serialized object; Snowflake plain
# text passes through untouched.
return _unwrap_bigquery_error(raw)

statement_errors: list[str] = []
for i, stmt in enumerate(job.get("statements") or []):
Expand Down
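The unwrapping behaviour in this hunk can be tried in isolation. The sketch below copies the regex and the function body from the diff (renamed without the leading underscore), then runs them on three inputs: the BigQuery wrapper shape quoted in the source comment, a made-up BigQuery error with escaped quotes, and a made-up Snowflake-style plain string. The two made-up messages are illustrative assumptions, not captured Query Service output.

```python
import re

# Same pattern as in the diff: capture the quoted value after `Message:`,
# allowing backslash-escaped characters inside the quotes.
_BQ_ERROR_MESSAGE_RE = re.compile(r'Message:\s*"((?:[^"\\]|\\.)*)"')


def unwrap_bigquery_error(message: str) -> str:
    """Return the inner Message text if the wrapper is present, else the input."""
    if message and (match := _BQ_ERROR_MESSAGE_RE.search(message)):
        return match.group(1).replace('\\"', '"')
    return message


# Wrapper shape taken from the comment in the diff.
wrapped = (
    '{Location: "query"; '
    'Message: "Syntax error: Unexpected identifier ..."; '
    'Reason: "invalidQuery"}'
)
# Hypothetical BigQuery error whose message contains escaped quotes.
escaped = r'{Location: "query"; Message: "Unrecognized name: \"foo\""; Reason: "invalidQuery"}'
# Hypothetical plain-text error with no wrapper (the Snowflake case).
plain = "SQL compilation error: invalid identifier 'FOO'"

print(unwrap_bigquery_error(wrapped))  # Syntax error: Unexpected identifier ...
print(unwrap_bigquery_error(escaped))  # Unrecognized name: "foo"
print(unwrap_bigquery_error(plain))    # SQL compilation error: invalid identifier 'FOO'
```

Because the function matches on the `Message: "..."` pattern rather than on the backend, any plain-text error that happens to contain that pattern would also be rewritten; the diff accepts this trade-off in exchange for not threading the backend through the error path.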
21 changes: 21 additions & 0 deletions src/keboola_agent_cli/constants.py
@@ -368,6 +368,11 @@
# default (legacy 2016 ws): FAIL ('JWT token is invalid')
#
# Extend ONLY after empirical confirmation across at least one non-AWS stack.
#
# This whitelist is SNOWFLAKE-SCOPED. BigQuery compatibility lives in its own
# set below because the `default` loginType means opposite things per backend
# (see BIGQUERY_WORKSPACE_LOGIN_TYPE). Compatibility is therefore keyed by
# (backend, loginType) -- see `_classify_qs_compatibility`.
SNOWFLAKE_WORKSPACE_LOGIN_TYPE: str = "snowflake-person-keypair"
QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES: frozenset[str] = frozenset(
{
@@ -377,6 +382,22 @@
}
)

# --- BigQuery Query Service compatibility (since v0.58.0) ---
# BigQuery workspaces carry a single `default` loginType -- the sandbox API does
# not expose Snowflake-style variants for BigQuery. The Query Service accepts it:
# verified 2026-06-04 against project 9621 on connection.keboola.com, where a
# `SELECT` against a read-only `default` BigQuery workspace returns rows.
#
# CRITICAL: `default` is on the BigQuery whitelist but deliberately OFF the
# Snowflake one above. Snowflake ALSO mints a `default` loginType (legacy 2016
# workspaces) which the Query Service REJECTS ('JWT token is invalid'). Keying
# compatibility on loginType alone would wrongly green-light those legacy
# Snowflake workspaces, so `_classify_qs_compatibility` dispatches on backend.
BIGQUERY_WORKSPACE_LOGIN_TYPE: str = "default"
QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES_BIGQUERY: frozenset[str] = frozenset(
{BIGQUERY_WORKSPACE_LOGIN_TYPE}
)

# --- Permission Exit Code ---
EXIT_PERMISSION_DENIED: int = 6
# --- Job-timeout Exit Code ---
44 changes: 33 additions & 11 deletions src/keboola_agent_cli/services/workspace_service.py
@@ -12,7 +12,12 @@
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

from ..constants import QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES, SNOWFLAKE_WORKSPACE_LOGIN_TYPE
from ..constants import (
BIGQUERY_WORKSPACE_LOGIN_TYPE,
QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES,
QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES_BIGQUERY,
SNOWFLAKE_WORKSPACE_LOGIN_TYPE,
)
from ..errors import ConfigError, ErrorCode, KeboolaApiError
from ..models import ProjectConfig
from .base import BaseService
@@ -28,22 +33,37 @@ class SnowflakeWorkspaceKeyPair:
public_pem: str


def _classify_qs_compatibility(login_type: str) -> bool:
"""Map a Storage API workspace ``connection.loginType`` to Query-Service compat.
def _classify_qs_compatibility(login_type: str, backend: str) -> bool:
"""Map a workspace ``(loginType, backend)`` pair to Query-Service compat.

Compatibility is keyed by BOTH backend and loginType because the same
``default`` string means opposite things per backend: a BigQuery workspace's
``default`` loginType IS Query-Service-compatible (verified against project
9621 on connection.keboola.com), whereas a Snowflake legacy ``default``
workspace is NOT ('JWT token is invalid'). See the two whitelists in
``constants`` for the empirical rationale.

Conservative whitelist semantics: returns True only for ``loginType``s
confirmed to work with POST /v2/storage/branch/{ID}/workspaces/{WS}/query.
See ``constants.QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES`` for the rationale
behind why ``snowflake-legacy-service`` (issue #304) stays off the list
even though it works on some stacks.
Unknown backends fall through to the Snowflake whitelist (false negatives
over false positives).
"""
if backend.lower() == "bigquery":
return login_type in QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES_BIGQUERY
return login_type in QUERY_SERVICE_COMPATIBLE_LOGIN_TYPES


def _workspace_login_type_for_backend(backend: str) -> str | None:
"""Return the loginType kbagent should request for newly created workspaces."""
if backend.lower() == "snowflake":
normalized = backend.lower()
if normalized == "snowflake":
return SNOWFLAKE_WORKSPACE_LOGIN_TYPE
if normalized == "bigquery":
# BigQuery's Query-Service-compatible loginType. Omitting it lets the
# backend default to the same value, but requesting it explicitly keeps
# parity with keboola-mcp-server and is robust to a server-side change
# of the implicit default.
return BIGQUERY_WORKSPACE_LOGIN_TYPE
return None


@@ -478,12 +498,13 @@ def worker(
config_id = ws.get("configurationId") or ""
component_id = ws.get("component") or ""
login_type = connection.get("loginType", "") or ""
backend = connection.get("backend", "") or ""
read_only = bool(ws.get("readOnlyStorageAccess", False))
entry = {
"project_alias": alias,
"id": ws.get("id"),
"name": config_names.get(str(config_id), ws.get("name", "")),
"backend": connection.get("backend", ""),
"backend": backend,
"host": connection.get("host", ""),
"database": connection.get("database", ""),
"warehouse": connection.get("warehouse", ""),
@@ -494,7 +515,7 @@ def worker(
"config_id": config_id,
"login_type": login_type,
"read_only": read_only,
"qs_compatible": _classify_qs_compatibility(login_type),
"qs_compatible": _classify_qs_compatibility(login_type, backend),
}
if orphaned_only:
if _is_orphaned_workspace(entry, config_names):
@@ -647,10 +668,11 @@ def get_workspace(

connection = ws_data.get("connection", {})
login_type = connection.get("loginType", "") or ""
backend = connection.get("backend", "") or ""
return {
"project_alias": alias,
"workspace_id": ws_data.get("id"),
"backend": connection.get("backend", ""),
"backend": backend,
"host": connection.get("host", ""),
"warehouse": connection.get("warehouse", ""),
"database": connection.get("database", ""),
@@ -659,7 +681,7 @@ def get_workspace(
"created": ws_data.get("created", ""),
"login_type": login_type,
"read_only": bool(ws_data.get("readOnlyStorageAccess", False)),
"qs_compatible": _classify_qs_compatibility(login_type),
"qs_compatible": _classify_qs_compatibility(login_type, backend),
"component_id": ws_data.get("component", "") or "",
"config_id": ws_data.get("configurationId", "") or "",
}
Expand Down
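The effect of keying compatibility on `(backend, loginType)` can be seen in a standalone sketch. The two sets below mirror the whitelists described in `gotchas.md`, and `classify` follows the dispatch shown in the diff. The `qs_compatible_only` filter and the sample workspace entries are hypothetical, written only to mimic the "read-only + whitelisted loginType" rule that `workspace list --qs-compatible` is documented to apply.

```python
# Whitelist contents as listed in gotchas.md (assumed to match constants.py).
SNOWFLAKE_OK = frozenset(
    {"snowflake-service-keypair", "snowflake-person-sso", "snowflake-person-keypair"}
)
BIGQUERY_OK = frozenset({"default"})


def classify(login_type: str, backend: str) -> bool:
    """Dispatch on backend first; unknown backends use the Snowflake whitelist."""
    if backend.lower() == "bigquery":
        return login_type in BIGQUERY_OK
    return login_type in SNOWFLAKE_OK


def qs_compatible_only(workspaces: list[dict]) -> list[dict]:
    """Hypothetical filter: keep read-only workspaces with a whitelisted loginType."""
    return [
        ws
        for ws in workspaces
        if ws["read_only"] and classify(ws["login_type"], ws["backend"])
    ]


# Made-up entries shaped like the `workspace list` output fields in the diff.
workspaces = [
    {"id": 1, "backend": "bigquery", "login_type": "default", "read_only": True},
    {"id": 2, "backend": "snowflake", "login_type": "default", "read_only": True},
    {"id": 3, "backend": "snowflake", "login_type": "snowflake-person-keypair", "read_only": True},
    {"id": 4, "backend": "snowflake", "login_type": "snowflake-person-keypair", "read_only": False},
]

print([ws["id"] for ws in qs_compatible_only(workspaces)])  # [1, 3]
```

Workspace 2 is the case the separate BigQuery whitelist exists for: a single shared set containing `default` would have passed the legacy Snowflake workspace as well.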