Create placeholder field in empty target deploy schemas by BenWu · Pull Request #9431 · mozilla/bigquery-etl

BenWu · 2026-05-22T20:25:32Z

Description

For target deploys, empty schemas (fields: []) are created when the dry run to get schemas fails. An empty schema causes issues because downstream tables and views can't query it. Example failed run: https://github.com/mozilla/bigquery-etl/actions/runs/26305494345/job/77442626593?pr=9429. This is blocking #9429.

Creating a schema with one field allows at least SELECT * views to work. I think this hasn't come up before because there are very few cases where the dry run would fail, and those tables haven't been edited, so they never got stage deployed. buildhub2 is one example, which I used as a test failure: https://github.com/mozilla/bigquery-etl/actions/runs/26310264998/job/77457419956

Reviewer, please follow this checklist

github-actions · 2026-05-22T20:39:16Z

Integration report for "Create placeholder field in empty target deploy schemas"

`sql.diff`


diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry/buildhub2/view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry/buildhub2/view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry/buildhub2/view.sql	2026-05-22 20:39:05.989816566 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry/buildhub2/view.sql	2026-05-22 20:38:59.458788857 +0000
@@ -4,6 +4,7 @@
 CREATE OR REPLACE VIEW
   `moz-fx-data-shared-prod.telemetry.buildhub2`
 AS
+-- test
 SELECT
   *
 FROM


github-actions

This PR changes _fetch_stub_schema in bigquery_etl/util/target.py so that, when both client.get_table and Schema.for_table's dry-run fail to produce a non-empty schema for an unmanaged stub-target dependency, it writes a one-field placeholder schema (_bqetl_stub_placeholder STRING NULLABLE) instead of an empty fields: []. The function now also treats a successful-but-empty-fields response from either source as a failure and falls through to the placeholder. A new TestFetchStubSchema class covers the dry-run-empty-fields path.
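The fall-through behavior summarized above can be sketched roughly as follows. This is a simplified stand-in, not the code in bigquery_etl/util/target.py: `fetch_stub_schema`, `get_table_fields`, and `dry_run_fields` are hypothetical names, the schema is returned instead of being written to a YAML file, and the name/type/mode layout of the placeholder field is an assumption based on the summary.

```python
from typing import Callable, Dict, List, Optional

Field = Dict[str, str]

# One-field placeholder mirroring the `_bqetl_stub_placeholder STRING NULLABLE`
# field named in the summary (the exact dict layout is an assumption).
PLACEHOLDER_FIELDS: List[Field] = [
    {"name": "_bqetl_stub_placeholder", "type": "STRING", "mode": "NULLABLE"}
]


def fetch_stub_schema(
    get_table_fields: Callable[[], List[Field]],
    dry_run_fields: Callable[[], List[Field]],
) -> Dict[str, List[Field]]:
    """Return the first non-empty schema, else a one-field placeholder."""
    errors: Dict[str, Optional[Exception]] = {"get_table": None, "dry-run": None}
    sources = (("get_table", get_table_fields), ("dry-run", dry_run_fields))
    for label, source in sources:
        try:
            fields = source()
            if fields:  # a successful but empty result counts as a failure
                return {"fields": fields}
        except Exception as e:
            errors[label] = e
    # Single placeholder write at the bottom, after both sources came up empty.
    print(f"Warning: could not fetch schema ({errors}); using placeholder.")
    return {"fields": [dict(f) for f in PLACEHOLDER_FIELDS]}
```

Passing the two sources as callables keeps the sketch free of BigQuery dependencies, so each fall-through path (a source raises, or a source returns an empty list) can be exercised with a plain lambda.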

Cross-cutting observations:

- The intent, letting downstream `SELECT *` stage views survive an inaccessible source, is well scoped, and the implementation is small and localized. The control flow (early return on success, single placeholder write at the bottom) is a clean refactor of the previous nested try/except.
- The main correctness concern is that `Schema.for_table` already swallows its own exceptions and returns `cls({"fields": []})` (bigquery_etl/schema/__init__.py:109), so the new `except Exception as e: for_table_err = e` arm is effectively unreachable in normal use. The warning will print `dry-run: None` even though `for_table` printed the real error to stdout earlier. Details are in the inline comment.
- Test coverage exercises one of three new fall-through paths (`get_table` raises and `for_table` returns empty fields). The empty `bq_table.schema` branch and the both-raise branch aren't covered; adding parametrized cases would lock the placeholder behavior in.
- Minor: there is a typo in the test name, and `_PLACEHOLDER_STUB_SCHEMA` is a shared mutable module global, which is fine for today's read-only usage and easy to harden.

Reviewer checklist items applicable here (issue ref, schema impact, restricted namespace) are satisfied: the PR links #9429, doesn't add new query fields, and touches utility code rather than restricted SQL namespaces.

github-actions · 2026-05-22T20:42:33Z

+    for_table_err: Optional[Exception] = None
    try:
-        Schema.for_table(
+        schema = Schema.for_table(
            project=project,
            dataset=dataset,
            table=name,
            id_token=id_token,
            partitioned_by=resolve_partition_for(sql_dir, project, dataset, name),
-        ).to_yaml_file(out_path)
-    except Exception as for_table_err:
-        print(
-            f"Warning: Could not fetch schema for {project}.{dataset}.{name}: "
-            f"get_table: {get_table_err}; dry-run: {for_table_err}"
        )
+        if schema.schema.get("fields"):
+            schema.to_yaml_file(out_path)
+            return
+    except Exception as e:
+        for_table_err = e


issue: Schema.for_table already catches its own exceptions internally and returns cls({"fields": []}) on failure (see bigquery_etl/schema/__init__.py:109-111) — it prints the underlying error and swallows it. That means this except branch is rarely taken in practice, and the warning emitted below will almost always show dry-run: None, hiding the actual reason the dry-run failed. The genuinely informative signal is the empty-fields fallthrough, not the exception. Consider either (a) re-raising in Schema.for_table and letting this caller decide how to log, or (b) dropping for_table_err and tightening the warning so readers don't chase a None.
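The effect described in this comment can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`for_table_like` and `warning_for`) rather than the real `Schema.for_table`, and assumes only what the comment states: the helper prints the underlying error, swallows it, and returns an empty field list.

```python
from typing import Dict, List, Optional


def for_table_like(fail: bool) -> Dict[str, List[dict]]:
    """Stand-in for a helper that prints and swallows its own errors."""
    try:
        if fail:
            raise PermissionError("dry run denied")
        return {"fields": [{"name": "id", "type": "INT64"}]}
    except Exception as e:
        print(f"Cannot get schema: {e}")  # the real reason only reaches stdout
        return {"fields": []}


def warning_for(fail: bool) -> Optional[str]:
    """Mimic the caller's try/except and return the warning it would emit."""
    for_table_err: Optional[Exception] = None
    try:
        schema = for_table_like(fail)
        if schema.get("fields"):
            return None  # schema found, no warning needed
    except Exception as e:  # not reached here: the helper never raises
        for_table_err = e
    return f"dry-run: {for_table_err}"
```

With `fail=True` the caller falls through on the empty field list, not on an exception, so the warning it builds reads `dry-run: None` while the actual `PermissionError` message has already gone to stdout.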

github-actions · 2026-05-22T20:42:45Z

+        f"get_table: {get_table_err}; dry-run: {for_table_err}. "
+        f"Writing placeholder schema for stage deploy."
+    )
+    Schema(_PLACEHOLDER_STUB_SCHEMA).to_yaml_file(out_path)


thought: downstream consumers that aren't simple SELECT * views (e.g., a view that references a specific column by name, or a query that joins on a particular field) will still fail against this placeholder — they'll just fail later in the deploy with a more obscure error than "empty schema." The PR description acknowledges the scope is SELECT * views, so this is fine, but it might be worth a short note in the warning ("downstream views that reference specific columns will still fail") so the deploy log makes the limitation obvious. Optional.

This is true, but they fail at the same spot, just with a different error message, so I don't think it makes a difference.

BenWu added 2 commits May 22, 2026 16:24

Create placeholder field in empty target deploy schemas

dc252bd

Create placeholder field in empty target deploy schemas

099e31f


move test change

5f30f11

BenWu marked this pull request as ready for review May 22, 2026 20:41

BenWu requested a review from a team as a code owner May 22, 2026 20:41

github-actions Bot reviewed May 22, 2026


add test, comments

d72a832

scholtzan approved these changes May 22, 2026


Merge branch 'main' into benwu/target-schema-placeholder

f9fad43

BenWu added this pull request to the merge queue May 22, 2026

Merged via the queue into main with commit 202e67e May 22, 2026
36 checks passed

BenWu deleted the benwu/target-schema-placeholder branch May 22, 2026 21:33

BenWu merged 5 commits into main from benwu/target-schema-placeholder
