feat(dataapp-developer): AJDA-2840 add BigQuery Direct Storage Access examples by sykora-ji · Pull Request #82 · keboola/ai-kit

sykora-ji · 2026-06-25T10:26:35Z

Description

Extends the dataapp-developer plugin's dataapp-development skill so it stops generating Snowflake-only SQL on BigQuery projects. Previously the skill claimed Direct Storage Access was "Snowflake only" and that the Query Service did not support BigQuery — both untrue. All changes are documentation/skill-content (no runtime code).

references/storage-access.md (main change):

- New "BigQuery SQL dialect" section: backtick identifier quoting for the two-part dataset.table reference (both `dataset`.`table` and `dataset.table` are valid; the trap is adding a third leading segment, because the Keboola in/out stage stays inside the mangled dataset name and is not a separate segment), bucket→dataset mangling (./- → _), the fact that only the dataset is mangled (the table name keeps its form), the verified "The project <stage> has not enabled BigQuery" error, and Storage → Overview as the authoritative source for Dataset/Table names.
- Unified the guidance on the Query Service as the preferred path on both backends; reframed backend routing so sql_dialect selects the SQL syntax to emit, not which API to call. The Storage API workspace-query endpoint is kept as a documented alternative for BigQuery (not legacy).
- Corrected the return-shape section: verified that the Query Service returns string cells on BigQuery too (only the Storage API endpoint returns native types); documented backend-specific column.type casing (Snowflake lowercase vs BigQuery uppercase).
- Documented verified statement-level rules: INSERT/DML works on BigQuery via the Query Service (rows_affected populated, round-trip confirmed), statements in one execute_query call share a session, and each statement must be exactly one SQL command.
- Replaced the "Snowflake only" claim in the Read-write Direct Storage Access section.
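
The return-shape point above (string cells from the Query Service, with `column.type` casing differing per backend) suggests normalising the type name before casting. The following is only a minimal sketch: the result shape (column dicts with `name` and `type` keys, rows as lists of strings) and the type names in the mapping are assumptions for illustration, not the documented Query Service response.

```python
from decimal import Decimal

# Hypothetical mapping from lowercased type names to Python casters.
# The type names are illustrative only, not a verified or exhaustive list.
CASTERS = {
    "integer": int,
    "int64": int,
    "number": Decimal,
    "numeric": Decimal,
    "float": float,
    "float64": float,
    "boolean": lambda s: s.lower() == "true",
    "bool": lambda s: s.lower() == "true",
}


def cast_rows(columns, rows):
    """Cast string cells to native values, ignoring type-name casing."""
    # Lowercasing handles both lowercase (Snowflake) and uppercase (BigQuery) names.
    casters = [CASTERS.get(col["type"].lower(), str) for col in columns]
    return [
        [None if cell is None else cast(cell) for cast, cell in zip(casters, row)]
        for row in rows
    ]


# Assumed example metadata for the same table on each backend.
snowflake_cols = [{"name": "id", "type": "integer"}, {"name": "name", "type": "varchar"}]
bigquery_cols = [{"name": "id", "type": "INT64"}, {"name": "name", "type": "STRING"}]
rows = [["1", "Alice"], ["2", None]]

print(cast_rows(snowflake_cols, rows))
print(cast_rows(bigquery_cols, rows))
```

Unknown type names fall back to `str`, so the cell is returned unchanged rather than raising on a type the mapping does not cover.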

Consistency across the skill: streamlit-apps.md, dev-workflow.md, and troubleshooting.md aligned to the unified Query Service path with BigQuery quoting notes; the four code templates note the BigQuery quoting/dataset adjustment; TODO.md drops the resolved "Snowflake only" items and records the verified findings.

Versioning: bumped dataapp-developer to 1.3.0 in plugin.json and marketplace.json (new documented capability).

All BigQuery behaviour above was verified live against a real BigQuery project via keboola-query-service (quoting, mangling, read shape, INSERT round-trip). The only remaining untested path is a direct-grant write to a real Storage table from a deployed app (platform end-to-end, not skill behaviour) — recorded in TODO.md.

Release Notes

Justification
- The skill generated Snowflake-specific SQL and claimed BigQuery was unsupported, so on BigQuery projects it produced queries that fail (wrong quoting, wrong dataset names). This corrects the guidance based on verified testing (parent AJDA-2835) and mirrors the public docs shipped in connection-docs#986 ("docs: AJDA-2839 add BigQuery section to Storage Access docs").
- In plain terms: the AI assistant that helps build Keboola data apps now writes correct database queries for BigQuery customers, not just Snowflake ones.
Plans for Customer Communication
- No customer communication needed. This is internal AI-kit skill/documentation content; no platform feature or API changes.
Impact Analysis
- No runtime impact — documentation/skill-content only. Nothing executes; the change only affects the guidance the assistant follows when generating data-app code. No feature flag. No single-tenant impact.
Deployment Plan
- Merge to main; distributed via the AI-kit marketplace plugin version (dataapp-developer 1.3.0). No stack-by-stack rollout.
Rollback Plan
- Fully reversible by reverting the commit / redeploying the previous plugin version. Not a one-way door.
Post-Release Support Plan
- No monitoring or Support notification required.

… examples

Extend the dataapp-development skill so it stops emitting Snowflake-only SQL on BigQuery projects.

- storage-access.md: new "BigQuery SQL dialect" section (backtick-per-segment quoting, bucket->dataset mangling, only the dataset is mangled, Storage Overview as the name source); unify on the Query Service as the preferred path on both backends with the Storage API workspace endpoint kept as an alternative; verified the Query Service returns string cells on BigQuery too (only the Storage API endpoint returns native types); document that INSERT/DML works on BigQuery via the Query Service, statements in one call share a session, and each statement must be exactly one SQL command.
- streamlit-apps.md, dev-workflow.md, troubleshooting.md: align wording and add BigQuery quoting notes.
- templates: note the BigQuery quoting/dataset adjustment.
- TODO.md: drop resolved "Snowflake only" items; record verified BQ findings.
- bump dataapp-developer to 1.3.0 (plugin.json + marketplace.json).

linear · 2026-06-25T10:26:39Z

AJDA-2840

Copilot

Pull request overview

Updates the dataapp-development skill content in the dataapp-developer plugin to document BigQuery Direct Storage Access and align guidance so BigQuery projects don’t receive Snowflake-only SQL patterns.

Changes:

Expanded references/storage-access.md with BigQuery-specific guidance (dialect, dataset naming, Query Service usage/return-shape, and RW access notes).
Aligned other references and templates to the unified “prefer Query Service on both backends” guidance, adding BigQuery quoting/dataset-name notes.
Bumped dataapp-developer plugin version to 1.3.0 in both plugin and marketplace metadata.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| plugins/dataapp-developer/skills/dataapp-development/TODO.md | Updates validation status notes for BigQuery behavior and removes outdated TODO items. |
| plugins/dataapp-developer/skills/dataapp-development/templates/streamlit/streamlit_app.py | Adds a template comment warning about BigQuery quoting + dataset mangling. |
| plugins/dataapp-developer/skills/dataapp-development/templates/nodejs-app/api/queries.js | Adds BigQuery quoting/dataset guidance near the FQN constant. |
| plugins/dataapp-developer/skills/dataapp-development/templates/duckdb-cache/python/cache.py | Notes BigQuery quoting/dataset mangling for the "edit this SQL" section. |
| plugins/dataapp-developer/skills/dataapp-development/templates/duckdb-cache/nodejs/duck.js | Notes BigQuery quoting/dataset mangling for the "edit this SQL" section. |
| plugins/dataapp-developer/skills/dataapp-development/references/troubleshooting.md | Updates troubleshooting guidance to reflect Query Service preference on both backends. |
| plugins/dataapp-developer/skills/dataapp-development/references/streamlit-apps.md | Aligns Streamlit storage-access guidance to Query Service on both backends + adds BigQuery note. |
| plugins/dataapp-developer/skills/dataapp-development/references/storage-access.md | Main documentation updates: BigQuery dialect, Query Service guidance, return-shape notes, and alternative endpoint. |
| plugins/dataapp-developer/skills/dataapp-development/references/dev-workflow.md | Adds a BigQuery dialect note to the dev-workflow query example context. |
| plugins/dataapp-developer/.claude-plugin/plugin.json | Bumps `dataapp-developer` version to `1.3.0`. |
| .claude-plugin/marketplace.json | Bumps marketplace entry for `dataapp-developer` to `1.3.0`. |


Address PR review (Copilot): the BigQuery quoting rule was wrong. A two-part `dataset.table` reference works whether you quote per segment (`` `dataset`.`table` ``) or as a single pair (`` `dataset.table` ``) — per-segment quoting is not required. The actual failure is adding a third leading segment (the Keboola stage `in`/`out`, or splitting the dotted bucket ID), which BigQuery resolves as a GCP project ("The project <stage> has not enabled BigQuery"). Verified live against a real BigQuery project (in_c_shared_bucket.cashier-data): `ds`.`tbl` and `ds.tbl` both succeed; `out`.`ds`.`tbl` and `out.ds.tbl` fail. Rewrites the rule + examples in storage-access.md and drops the misleading "per segment" wording from the other references and templates.
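
The corrected rule in this commit can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, the bucket ID `in.c-shared-bucket` is an assumed source for the verified dataset `in_c_shared_bucket`, and the segment check is a naive heuristic that would misfire on a table name containing a dot.

```python
import re


def mangle_bucket_id(bucket_id: str) -> str:
    """Map a Keboola bucket ID to its BigQuery dataset name ('.' and '-' become '_')."""
    return re.sub(r"[.-]", "_", bucket_id)


def bigquery_table_ref(bucket_id: str, table: str) -> str:
    """Build a two-part, backtick-quoted reference; only the dataset is mangled."""
    return f"`{mangle_bucket_id(bucket_id)}`.`{table}`"


def has_extra_leading_segment(ref: str) -> bool:
    """Flag references with more than two dot-separated segments."""
    # Strip backticks so per-segment and single-pair quoting are treated alike.
    return len(ref.replace("`", "").split(".")) > 2


# The stage stays inside the dataset name; the table keeps its hyphen.
print(bigquery_table_ref("in.c-shared-bucket", "cashier-data"))

# Both two-part quoting styles pass; a separate stage segment is flagged.
for ref in ("`ds`.`tbl`", "`ds.tbl`", "`out`.`ds`.`tbl`", "`out.ds.tbl`"):
    print(ref, has_extra_leading_segment(ref))
```

Stripping backticks before counting segments reflects the finding above: the quoting style does not matter, only the number of segments does.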

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 9 comments.

Address PR review (Copilot):

- Replace ambiguous "no stage prefix" with "the in/out stage stays inside the mangled dataset name, not a separate segment" across storage-access.md, dev-workflow.md, and the four templates. The mangled name keeps in_/out_ (e.g. `in_c_main`); the rule is no extra stage *segment*.
- Bump dataapp-developer README "## Version" to 1.3.0 to match plugin.json and marketplace.json.

sykora-ji · 2026-06-25T11:33:32Z

Thanks — second review round addressed in 515c5b7 (plus a PR-description update). All nine comments were valid:

"no stage prefix" was ambiguous (7 comments). Correct — the mangled dataset name keeps the stage (in_c_main, out_c_analysis), so "no stage prefix" could be misread as dropping in_/out_. Reworded everywhere to: the in/out stage stays inside the single mangled dataset name, never a separate segment. Fixed in storage-access.md (the forward-pointer, the sql_dialect routing line, and the comparison table), the four templates, and also dev-workflow.md (same wording, not flagged but corrected for consistency).

README version (1 comment). Bumped dataapp-developer/README.md "## Version" to 1.3.0 to match plugin.json and marketplace.json.

PR description conflict (1 comment). Updated the PR description — removed the stale "backtick-per-segment / never around the whole FQN" phrasing so it matches the corrected section (both `dataset`.`table` and `dataset.table` are valid; the rule is no extra leading stage segment).

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

- Note the JS SDK method name `executeQuery` alongside Python's `execute_query` in the shared-session statement rule.
- Use the full 3-part Snowflake FQN in the dialect comparison table to match the doc's own "always use the fully-qualified name" rule (the BigQuery row stays 2-part, which is correct for BigQuery).

sykora-ji · 2026-06-25T11:51:16Z

Third round addressed in 772bcba — both valid:

execute_query is Python-specific (storage-access.md). Added the JS name alongside it: "in one execute_query (Python) / executeQuery (JS) call".
Snowflake example omitted the database prefix (dialect table). Changed the Snowflake row to the full 3-part FQN "KBC_REGION_PROJID"."in.c-main"."customers" to match the doc's "always use the fully-qualified name" rule. The BigQuery row stays 2-part (`in_c_main`.`customers`), which is correct for BigQuery — no project/database segment.
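
The two rows of the dialect table can be reproduced with a small routing helper. This is a sketch under stated assumptions: the function name, its parameters, and the accepted `sql_dialect` values (`snowflake`, `bigquery`, compared case-insensitively) are hypothetical, and it only selects the SQL syntax to emit, not which API to call.

```python
def qualified_table(sql_dialect: str, bucket_id: str, table: str, database: str = "") -> str:
    """Return a table reference in the syntax of the given backend."""
    dialect = sql_dialect.lower()
    if dialect == "snowflake":
        if not database:
            raise ValueError("Snowflake needs the database for the 3-part FQN")
        # Snowflake: double quotes, bucket ID used verbatim as the schema.
        return f'"{database}"."{bucket_id}"."{table}"'
    if dialect == "bigquery":
        # BigQuery: backticks, bucket ID mangled into the dataset, no third segment.
        dataset = bucket_id.replace(".", "_").replace("-", "_")
        return f"`{dataset}`.`{table}`"
    raise ValueError(f"unsupported sql_dialect: {sql_dialect!r}")


print(qualified_table("snowflake", "in.c-main", "customers", database="KBC_REGION_PROJID"))
print(qualified_table("bigquery", "in.c-main", "customers"))
```

The database argument is deliberately ignored for BigQuery, since a leading project or stage segment is exactly what the rule above warns against.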

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

sykora-ji requested a review from Copilot June 25, 2026 10:28

Copilot started reviewing on behalf of sykora-ji June 25, 2026 10:28

sykora-ji requested a review from MiroCillik June 25, 2026 10:29

sykora-ji marked this pull request as ready for review June 25, 2026 10:29

Copilot AI reviewed Jun 25, 2026

Comment thread plugins/dataapp-developer/skills/dataapp-development/references/storage-access.md Outdated

sykora-ji requested a review from Copilot June 25, 2026 10:59

Copilot started reviewing on behalf of sykora-ji June 25, 2026 11:00

Copilot AI reviewed Jun 25, 2026

sykora-ji requested a review from Copilot June 25, 2026 11:35

Copilot started reviewing on behalf of sykora-ji June 25, 2026 11:36

Copilot AI reviewed Jun 25, 2026

Comment thread plugins/dataapp-developer/skills/dataapp-development/references/storage-access.md Outdated

Comment thread plugins/dataapp-developer/skills/dataapp-development/references/storage-access.md Outdated

sykora-ji requested a review from Copilot June 25, 2026 11:51

Copilot started reviewing on behalf of sykora-ji June 25, 2026 11:51

Copilot AI reviewed Jun 25, 2026
