feat(dataapp-developer): AJDA-2840 add BigQuery Direct Storage Access examples #82
sykora-ji wants to merge 4 commits into
… examples

Extend the dataapp-development skill so it stops emitting Snowflake-only SQL on BigQuery projects.

- storage-access.md: new "BigQuery SQL dialect" section (backtick-per-segment quoting, bucket->dataset mangling, only the dataset is mangled, Storage Overview as the name source); unify on the Query Service as the preferred path on both backends with the Storage API workspace endpoint kept as an alternative; verified the Query Service returns string cells on BigQuery too (only the Storage API endpoint returns native types); document that INSERT/DML works on BigQuery via the Query Service, statements in one call share a session, and each statement must be exactly one SQL command.
- streamlit-apps.md, dev-workflow.md, troubleshooting.md: align wording and add BigQuery quoting notes.
- templates: note the BigQuery quoting/dataset adjustment.
- TODO.md: drop resolved "Snowflake only" items; record verified BQ findings.
- bump dataapp-developer to 1.3.0 (plugin.json + marketplace.json).
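Because the Query Service returns every cell as a string and the reported column type differs in casing between backends, an app usually has to convert values itself. The following is a minimal sketch of that idea, not the skill's actual code: the `CONVERTERS` table and the `coerce_cell` helper are hypothetical names, and the handful of type names covered is illustrative rather than a complete list of what either backend reports.

```python
from typing import Callable, Dict, Optional

# Hypothetical converters keyed by upper-cased type name (assumption:
# the response exposes a type name per column). Extend as needed.
CONVERTERS: Dict[str, Callable[[str], object]] = {
    "INT64": int,
    "INTEGER": int,
    "FLOAT64": float,
    "FLOAT": float,
    "BOOL": lambda s: s.lower() == "true",
    "BOOLEAN": lambda s: s.lower() == "true",
}


def coerce_cell(value: Optional[str], column_type: str) -> object:
    """Convert one string cell using a case-insensitive type lookup."""
    if value is None:
        return None  # keep NULLs as None
    convert = CONVERTERS.get(column_type.upper())
    # Unknown types are returned unchanged as strings.
    return convert(value) if convert else value
```

Upper-casing the type name before the lookup means the same table serves both the lowercase and the uppercase `column.type` variants.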
Pull request overview
Updates the dataapp-development skill content in the dataapp-developer plugin to document BigQuery Direct Storage Access and align guidance so BigQuery projects don’t receive Snowflake-only SQL patterns.
Changes:
- Expanded `references/storage-access.md` with BigQuery-specific guidance (dialect, dataset naming, Query Service usage/return-shape, and RW access notes).
- Aligned other references and templates to the unified “prefer Query Service on both backends” guidance, adding BigQuery quoting/dataset-name notes.
- Bumped `dataapp-developer` plugin version to `1.3.0` in both plugin and marketplace metadata.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| plugins/dataapp-developer/skills/dataapp-development/TODO.md | Updates validation status notes for BigQuery behavior and removes outdated TODO items. |
| plugins/dataapp-developer/skills/dataapp-development/templates/streamlit/streamlit_app.py | Adds a template comment warning about BigQuery quoting + dataset mangling. |
| plugins/dataapp-developer/skills/dataapp-development/templates/nodejs-app/api/queries.js | Adds BigQuery quoting/dataset guidance near the FQN constant. |
| plugins/dataapp-developer/skills/dataapp-development/templates/duckdb-cache/python/cache.py | Notes BigQuery quoting/dataset mangling for the “edit this SQL” section. |
| plugins/dataapp-developer/skills/dataapp-development/templates/duckdb-cache/nodejs/duck.js | Notes BigQuery quoting/dataset mangling for the “edit this SQL” section. |
| plugins/dataapp-developer/skills/dataapp-development/references/troubleshooting.md | Updates troubleshooting guidance to reflect Query Service preference on both backends. |
| plugins/dataapp-developer/skills/dataapp-development/references/streamlit-apps.md | Aligns Streamlit storage-access guidance to Query Service on both backends + adds BigQuery note. |
| plugins/dataapp-developer/skills/dataapp-development/references/storage-access.md | Main documentation updates: BigQuery dialect, Query Service guidance, return-shape notes, and alternative endpoint. |
| plugins/dataapp-developer/skills/dataapp-development/references/dev-workflow.md | Adds a BigQuery dialect note to the dev-workflow query example context. |
| plugins/dataapp-developer/.claude-plugin/plugin.json | Bumps dataapp-developer version to 1.3.0. |
| .claude-plugin/marketplace.json | Bumps marketplace entry for dataapp-developer to 1.3.0. |
Address PR review (Copilot): the BigQuery quoting rule was wrong. A
two-part `dataset.table` reference works whether you quote per segment
(`` `dataset`.`table` ``) or as a single pair (`` `dataset.table` ``) —
per-segment quoting is not required. The actual failure is adding a third
leading segment (the Keboola stage `in`/`out`, or splitting the dotted
bucket ID), which BigQuery resolves as a GCP project ("The project <stage>
has not enabled BigQuery").
Verified live against a real BigQuery project (in_c_shared_bucket.cashier-data):
`ds`.`tbl` and `ds.tbl` both succeed; `out`.`ds`.`tbl` and `out.ds.tbl` fail.
Rewrites the rule + examples in storage-access.md and drops the misleading
"per segment" wording from the other references and templates.
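The corrected rule can be sketched as a small helper. This is an illustration under stated assumptions, not code from the PR: `bigquery_table_ref` is a hypothetical name, the mangling is taken to be a plain replacement of `.` and `-` with `_` in the bucket ID, and the bucket ID `in.c-shared-bucket` is inferred from the verified dataset name rather than quoted from the project.

```python
def bigquery_table_ref(bucket_id: str, table_name: str) -> str:
    """Build a two-part BigQuery reference from a Keboola bucket ID and table name.

    Assumption: the dataset name is the bucket ID with '.' and '-' replaced
    by '_'. The table name is left untouched. The in/out stage stays inside
    the dataset name; it is never emitted as a separate leading segment.
    """
    dataset = bucket_id.replace(".", "_").replace("-", "_")
    # Backticks are needed here because table names may contain hyphens.
    return f"`{dataset}`.`{table_name}`"


# Hypothetical bucket ID matching the verified dataset name above.
ref = bigquery_table_ref("in.c-shared-bucket", "cashier-data")
query = f"SELECT * FROM {ref} LIMIT 10"
```

The helper never produces a three-part name, which is the form BigQuery would resolve as a GCP project.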
Address PR review (Copilot):

- Replace ambiguous "no stage prefix" with "the in/out stage stays inside the mangled dataset name, not a separate segment" across storage-access.md, dev-workflow.md, and the four templates. The mangled name keeps in_/out_ (e.g. `in_c_main`); the rule is no extra stage *segment*.
- Bump dataapp-developer README "## Version" to 1.3.0 to match plugin.json and marketplace.json.
Thanks — second review round addressed in "no stage prefix" was ambiguous (7 comments). Correct — the mangled dataset name keeps the stage ( README version (1 comment). Bumped PR description conflict (1 comment). Updated the PR description — removed the stale "backtick-per-segment / never around the whole FQN" phrasing so it matches the corrected section (both |
- Note the JS SDK method name `executeQuery` alongside Python's `execute_query` in the shared-session statement rule.
- Use the full 3-part Snowflake FQN in the dialect comparison table to match the doc's own "always use the fully-qualified name" rule (the BigQuery row stays 2-part, which is correct for BigQuery).
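The "exactly one SQL command per statement" rule can be checked before a batch is sent. The sketch below is a naive illustration, not part of either SDK: `is_single_statement` is a hypothetical helper, it ignores semicolons only inside simple quoted spans, and it does not handle escaped quotes or SQL comments.

```python
def is_single_statement(sql: str) -> bool:
    """Return True if sql contains exactly one command (naive scan)."""
    body = sql.strip()
    quote = None  # currently open quote character, if any
    for i, ch in enumerate(body):
        if quote:
            if ch == quote:
                quote = None  # closing quote
        elif ch in ("'", '"', "`"):
            quote = ch  # opening quote
        elif ch == ";" and body[i + 1:].strip():
            return False  # more text follows a statement terminator
    # Reject empty input and input that is only a terminator.
    return bool(body.rstrip(";").strip())


statements = [
    "INSERT INTO `ds`.`tbl` (id) VALUES (1)",
    "SELECT COUNT(*) FROM `ds`.`tbl`",
]
all_single = all(is_single_statement(s) for s in statements)
```

Keeping each list entry to one command lets the statements passed in one call rely on the shared session described above.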
Third round addressed in
link to issue
Description
Extends the `dataapp-developer` plugin's `dataapp-development` skill so it stops generating Snowflake-only SQL on BigQuery projects. Previously the skill claimed Direct Storage Access was "Snowflake only" and that the Query Service did not support BigQuery; both claims were untrue. All changes are documentation/skill-content (no runtime code).

`references/storage-access.md` (main change):
- … `dataset.table` reference (both `` `dataset`.`table` `` and `` `dataset.table` `` are valid; the trap is adding a third leading segment: the Keboola `in`/`out` stage stays inside the mangled dataset name, not a separate segment), bucket→dataset mangling (`.`/`-` → `_`), the fact that only the dataset is mangled (the table name keeps its form), the verified `The project <stage> has not enabled BigQuery` error, and Storage → Overview as the authoritative source for Dataset/Table names.
- … `sql_dialect` selects the SQL syntax to emit, not which API to call. The Storage API workspace-query endpoint is kept as a documented alternative for BigQuery (not legacy).
- … `column.type` casing (Snowflake lowercase vs BigQuery uppercase).
- … (`rows_affected` populated, round-trip confirmed), statements in one `execute_query` call share a session, and each statement must be exactly one SQL command.

Consistency across the skill: `streamlit-apps.md`, `dev-workflow.md`, and `troubleshooting.md` aligned to the unified Query Service path with BigQuery quoting notes; the four code templates note the BigQuery quoting/dataset adjustment; `TODO.md` drops the resolved "Snowflake only" items and records the verified findings.

Versioning: bumped `dataapp-developer` to `1.3.0` in `plugin.json` and `marketplace.json` (new documented capability).

All BigQuery behaviour above was verified live against a real BigQuery project via `keboola-query-service` (quoting, mangling, read shape, INSERT round-trip). The only remaining untested path is a `direct-grant` write to a real Storage table from a deployed app (platform end-to-end, not skill behaviour), which is recorded in `TODO.md`.

Release Notes
`main`; distributed via the AI-kit marketplace plugin version (`dataapp-developer` 1.3.0). No stack-by-stack rollout.