Skip to content

Claude merge 3#2706

Open
dimoffon wants to merge 4594 commits into
adb-8.xfrom
claude-merge-3
Open

Claude merge 3#2706
dimoffon wants to merge 4594 commits into
adb-8.xfrom
claude-merge-3

Conversation

@dimoffon

Copy link
Copy Markdown
Member

Bump up to PostgreSQL15

dimoffon and others added 30 commits June 10, 2026 14:20
…G14)

ORCA's DML plan for a partitioned target goes through a dynamic scan
against the partition root, with ModifyTable.resultRelations naming only
the root; finding the leaf a tuple belongs to relied on
ModifyTable.forceTupleRouting, whose executor consumer was removed
during the PG14 nodeModifyTable rework (b04e559) -- PG14 routes
inherited updates via per-leaf result relations and "tableoid" junk
columns instead, which ORCA's plans don't provide.  An in-place
UPDATE/DELETE therefore tried to modify the storage-less partition root:

    ERROR:  could not open file "pg_tblspc/0/GPDB_8_.../0/0"

(catcache, qp_dropped_cols and the wider could-not-open-file failure
cluster; split updates that modify the distribution key survived only
because their INSERT half goes through the partitioned-INSERT routing.)

Restore the pre-#14129 guards in TranslateUpdateQueryToDXL and
TranslateDeleteQueryToDXL so ORCA raises ExmiQuery2DXLUnsupportedFeature
and falls back to the Postgres planner, which handles partitioned
UPDATE/DELETE correctly on PG14.  INSERT stays with ORCA.  Revisit by
porting per-tuple leaf routing onto the PG14 executor model.

Verified under optimizer=on: partitioned non-key UPDATE, partition-key
(cross-partition) UPDATE, partitioned DELETE all correct; UPDATE routing
a NULL partition key reports the proper "no partition found" error
instead of the storage error.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… dispatched" (PG14)

Since PG13 (commit 5028981), CREATE TABLE (LIKE ... INCLUDING INDEXES)
defers index creation: transformCreateStmt leaves a TableLikeClause in
the statement list, and ProcessUtilitySlow expands it later via
expandTableLikeClause() into IndexStmts marked transformed=true.  The
upstream T_IndexStmt path maps transformed -> is_alter_table=true
("treat it like ALTER TABLE ADD INDEX"), and GPDB's DefineIndex()
suppresses its QE dispatch when is_alter_table on the assumption that an
enclosing ALTER TABLE is dispatched as a whole.

For the LIKE path there is no enclosing command: the CreateStmt was
already dispatched with its own oids, and the cloned IndexStmt was
executed only on the QD.  The index oids preassigned there were never
sent ("ERROR: oids were assigned, but not dispatched to QEs") and the
index was missing on the segments.  This broke CREATE TABLE LIKE
INCLUDING INDEXES/ALL across alter_table, partition1, partition_storage,
index_constraint_naming*, and bfv_index.

Dispatch the transformed IndexStmt explicitly from ProcessUtilitySlow's
T_IndexStmt case, mirroring DefineIndex's own dispatch (same flags, name
pinned via stmt->idxname, oldNode cleared, preassigned oids attached).
The QE re-executes it with is_alter_table=true and consumes the
dispatched oids.

Verified: CREATE TABLE (LIKE src INCLUDING ALL) succeeds, the pkey index
exists on the QD and on every segment, and the unique constraint is
enforced segment-side (duplicate key rejected).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…R_READY (PG14)

PG14 added CAC_NOTCONSISTENT (pmState == PM_RECOVERY) to reject
connections to a hot-standby that has not reached consistency.  The
merge placed that branch in canAcceptConnections() above GPDB's
GetMirrorReadyFlag() -> CAC_MIRROR_READY check.  A GPDB mirror runs with
hot_standby off and stays in PM_RECOVERY for its whole life, so the
mirror-ready branch became dead code: every connection to a mirror was
answered with "the database system is not accepting connections /
Hot standby mode is disabled."

CAC_NOTCONSISTENT has no FTS exemption in ProcessStartupPacket (only
CAC_STARTUP and CAC_MIRROR_READY do), so the FTS probe process could not
connect to any mirror at all: probes ended in "FTS double fault
detected", promotion requests never reached the mirror (catalog flipped
to role=p while the segment kept running as a standby -- standby.signal
in place, walreceiver streaming), and gprecoverseg failed because it
could not read the version string from the CAC_MIRROR_READY error.
Every mirror failover wedged the cluster unrecoverably, and the FTS
regress tests (fts_error, fts_recovery_in_progress, ...) hung.

Return CAC_MIRROR_READY before the CAC_NOTCONSISTENT branch when the
walreceiver has been launched and hot_standby is off (a GPDB mirror).
Genuine hot-standby servers that merely have not reached consistency
keep the upstream fail-fast CAC_NOTCONSISTENT behavior.

Verified end-to-end: direct connection to a mirror reports the
version-bearing mirror-ready error; killing a primary now leads to FTS
truly promoting the mirror (standby.signal removed, segment accepts
queries, QD queries work across the failover); gprecoverseg -a and -ar
restore and rebalance the cluster.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gprecoverseg incremental recovery runs

    pg_rewind --write-recovery-conf --slot="internal_wal_replication_slot" ...

The PG14 merge took upstream pg_rewind wholesale (46f49ad), dropping
the GPDB-added --slot option (eacc688), so every incremental
recovery failed with "unrecognized option '--slot=...'" and left the
downed segment unrecovered.

Re-add -S/--slot on top of the PG14 implementation: upstream's
GenerateRecoveryConfig() (shared with pg_basebackup) already takes a
replication-slot argument and emits primary_slot_name; pass the option
through at both -R call sites and reject --slot without
--write-recovery-conf, as before.

Verified: gprecoverseg -a incremental recovery succeeds ("Segments
successfully recovered", mirror back in sync) and gprecoverseg -ar
rebalances to preferred roles.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(PG14)

GPDB append-optimized tables cannot fetch the old tuple by TID, so an
UPDATE plan over them must emit the full new tuple:
preprocess_targetlist() expands the targetlist when the target relation
is AO, and ExecModifyTable leaves the old slot empty for AO result
relations on the strength of that contract.

With PG14 native partitioning the target of a partitioned UPDATE is the
storage-less root, for which RelationIsAppendOptimized() is always
false, so the expansion never happened when only the leaves are AO.
The per-leaf update projection then referenced old-tuple columns the AO
leaf could not provide, failing with

    ERROR:  getsomeattrs is not required to be called on a virtual tuple table slot

across alter_table_aocs*, expand_table_ao*, alter_ao_part_tables*,
alter_ao_part_exch* (10 regress tests).

Add rel_has_appendoptimized_partition(): for a partitioned target, scan
its inheritors and force the expansion when any of them uses an AO
access method.  Split updates already expanded; pure-heap partitioned
updates keep the upstream narrow targetlist.

Verified: AO-row and AOCS partitioned UPDATEs return correct results;
heap partitioned UPDATE unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaying a 2PC DROP TABLESPACE (COMMIT_PREPARED with GPDB's
tablespace_oid_to_delete_on_commit) on a mirror could die with

    FATAL:  could not open directory "<location>/<dbid>": No such file or directory
    CONTEXT: WAL redo at ... for Transaction/COMMIT_PREPARED ...

destroy_tablespace_directories() downgrades its own errors to LOG under
redo, but the directory_is_empty() check on the symlink target uses
ReadDir at ERROR, which the startup process escalates to FATAL -- so a
vanished/unreadable target directory took the whole mirror down over
disk space we merely failed to release.  FTS then marked the mirror
down and pg_regress aborted the suite (temp_tablespaces /
alter_db_set_tablespace window).

Add directory_is_empty_ext() with a caller-chosen elevel and use it in
the redo path (LOG); an unreadable directory counts as empty and the
subsequent rmdir's LOG reports the leftover.

Verified: create tablespace -> create/insert/drop table -> drop
tablespace replays cleanly; all mirrors stay up and in sync.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…g group (PG14)

A merge artifact in ExplainOnePlan left a dangling
"if (es->summary && (planduration || bufusage))" glued onto the
query-identifier condition, plus a GPDB6-leftover second buffer-usage
block ending in an ExplainCloseGroup("Planning") with no matching open.
Whenever ANALYZE ran without BUFFERS, that stray close popped the
"Query" group early: every key after "Planning Time" (Triggers, Slice
statistics, Execution Time) was emitted outside the object, producing
structurally invalid JSON ("Expected , or ] but found :").  Text format
hid it because group closes are no-ops there; explain, explain_format,
gin and join_hash failed on it.

Restore the upstream PG14 shape (queryId, then the Planning group
wrapping only planning buffer usage, then Planning Time), keeping
GPDB's slice-table print, and drop the duplicate buffer block.

Verified: EXPLAIN (FORMAT JSON, ANALYZE) and (FORMAT JSON, ANALYZE,
BUFFERS) both parse with json.loads.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The expanded append-optimized UPDATE targetlist kept NULL placeholders
for dropped columns (resno == attno), but PG14's
ExecBuildUpdateProjection() pairs every non-junk subplan column with an
update_colnos target and rejects dropped target columns:

    ERROR:  table row type and query-specified row type do not match
    DETAIL:  Query provides a value for a dropped column at ordinal position N.

broke every AO/AOCS UPDATE on a table with a dropped column
(alter_table_gp, drop_column_update, alter_table_analyze,
alter_ao_table_col_ddl_*, uao_allalter_*).

For a plain (non-split) AO update, strip the dropped-column
placeholders after expansion and renumber the resnos; the executor sets
dropped columns of the new tuple to NULL itself and, with every live
column assigned, never reads the old-tuple slot that AO cannot
populate.  A Split Update keeps the full physical row: it runs as
delete+insert and never builds the update projection.

Also includes rel_has_appendoptimized_partition() interplay: partitioned
AO targets take the same path.

Verified: AO and AOCS dropped-column UPDATEs return correct results.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaying Database/CREATE (ALTER DATABASE SET TABLESPACE, movedb) on a
mirror copies a live database directory while the checkpointer can
unlink files of dropped relations at restartpoints.  copy_file()/lstat
then died with

    FATAL:  could not open file ".../<relfilenode>_fsm": No such file or directory
    CONTEXT: WAL redo ... Database/CREATE: copy dir ...

killing the startup process and downing the mirror (alter_db_set_tablespace
aborted the whole regress suite this way).  The primary's copy simply
never saw those files.

Skip ENOENT sources with a LOG during recovery (InRecovery) in both the
directory scan and the file copy; normal execution still errors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After an incremental (pg_rewind) recovery a mirror replays from before
a CREATE TABLESPACE whose location directory has since been removed
from disk (regression tests drop the tablespace and clean up the
directory).  create_tablespace_directories() then FATALed the startup
process with "directory does not exist", leaving the mirror permanently
unrecoverable short of a full rebuild.

During recovery, create the missing location with pg_mkdir_p() and
press on -- the same philosophy TablespaceCreateDbspace() documents for
replaying into dropped tablespaces.  Normal execution still errors.

Also includes the directory_is_empty_ext() redo hardening in the drop
path from the previous commit series.

Verified: a mirror rewound to before CREATE TABLESPACE replays through
create/use/drop of tablespaces and returns to sync; "creating missing
directory ... during replay" appears in its log.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two problems in TranslateDXLDml on tables with dropped columns:

1. A plain (non-split) UPDATE padded the subplan target list with NULL
   placeholders for dropped columns and listed their attnos in
   updateColnosLists; PG14's ExecBuildUpdateProjection() rejects
   assignments to dropped columns ("Query provides a value for a
   dropped column").  Skip the padding for non-split updates and build
   updateColnosLists from the live columns' attribute numbers; the
   executor nulls dropped columns of the new tuple itself.

2. A Split Update (delete+insert, distribution key change) silently
   CORRUPTED rows: the insert half wrote misaligned values (a SET a=...
   update lost the other columns' values).  Until the PG14 insert path
   understands ORCA's padded rows, raise unsupported and fall back to
   the Postgres planner, which handles it correctly.

Verified under optimizer=on: AO and heap dropped-column UPDATEs return
correct results; a distribution-key UPDATE on a dropped-column table
falls back and preserves all column values.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A mis-merged brace in heap_create_with_catalog() attached the outer
"relkind has no rowtype" else-branch to the inner GPDB
"skip array type for AO relations" if-statement.  For every AO/AOCS
relation the composite type was created (pg_type row present, typrelid
correct) and then new_type_oid was reset to InvalidOid, so the pg_class
tuple was written with reltype = 0.

Fallout: RenameRelationInternal() skips RenameTypeInternal() when
reltype is invalid, so renaming an AO table left its rowtype under the
old name.  ALTER TABLE EXCHANGE PARTITION decomposes into a three-way
rename and collided with the stale type ("type <partition> already
exists"), breaking 15 regress tests (partition, partition1,
distributed_transactions, alter_table_ao*, alter_ao_part_*,
column_compression, oid_consistency, portals_updatable); anything
consulting an AO relation's rowtype (whole-row Vars, "relation does not
have a composite type") misbehaved too.

Move the else to the outer if, where upstream has it.  Newly created
clusters/tables get correct catalogs; existing AO tables keep reltype=0
until recreated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nodes (PG14)

PG14's CREATE FUNCTION/PROCEDURE ... BEGIN ATOMIC / RETURN stores the
body in CreateFunctionStmt.sql_body.  The field was copied and compared
(copyfuncs/equalfuncs) but never serialized, so the QD dispatched the
statement without a body and QEs failed with "no function body
specified" (create_procedure, create_function_3).

Dispatching the raw body surfaced further binary-serialization gaps:

- ReturnStmt and RawStmt had no readers at all; add _readReturnStmt /
  _readRawStmt and wire both node types into the outfast/readfast
  switches.
- _readSelectStmt did not read the PG14 groupDistinct bool that
  _outSelectStmt writes, desynchronizing the stream one byte
  ("could not deserialize unrecognized node type: <garbage>").
- ParamRef had a writer but no reader, breaking RETURN $1-style bodies.

Verified: BEGIN ATOMIC procedure inserts through dispatch, RETURN $1*2
function evaluates on segments, GROUP BY DISTINCT round-trips.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…PG14)

GPDB samples AO/AOCS relations by logical row number: acquire_sample_rows
sets totalblocks to the tuple count and the table AM's
scan_analyze_next_block interprets the sampled "block numbers"
accordingly.  PG14's new analyze prefetching (commit c6fc50c) is
unaware of that convention and calls PrefetchBuffer(rel, MAIN_FORKNUM,
<row number>), which drives md smgr into the append-optimized segment
files:

    ERROR:  could not open file "base/.../<relfilenode>.1" (target block N):
            previous segment is only 0 blocks (md.c:1382)

breaking ANALYZE on any AO table large enough for the sampler to pick a
"block" past the empty .0 segment (analyze, rle_delta,
alter_table_aocs2, gporca, index2, uao_dml_row, uao_dml_column).

Disable prefetching for append-optimized relations; the row-number
"blocks" have no buffer-manager representation to prefetch.

Verified: ANALYZE on 100k-row AO (with index) and 50k-row AOCS tables
completes with sane reltuples.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Commit 943a423 restored system_functions.sql in initdb, whose
upstream-PG14 CREATE OR REPLACE defines pg_terminate_backend(pid int4,
timeout int8 DEFAULT 0).  pg_proc.dat still carried the pre-PG14
single-argument signature for oid 2096, so instead of replacing it the
bootstrap created a third pg_terminate_backend (alongside GPDB's
(int4, text) message variant), and every plain
pg_terminate_backend(pid) call failed with

    ERROR:  function pg_terminate_backend(integer) is not unique

(gp_sync_lc_gucs, session_reset).  The C function is already PG14's
two-argument version -- with the one-argument catalog entry it even read
an uninitialized timeout argument.

Update oid 2096 to the upstream signature; system_functions.sql now
replaces it in place and supplies the default.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…PG14)

The PG14 merge left refresh_by_match_merge() half upstream, half GPDB:

- The generated DELETE/diff queries used bare ctid row addressing;
  on a distributed matview the parser rejects that ("DELETE uses
  system-defined column ctid without ... gp_segment_id").  Restore
  GPDB's (ctid, gp_segment_id) pairing: the diff table stores both tid
  and sid and the DELETE matches on the pair (matview, matview_ao).

- Upstream stores the new data in the diff table as a whole-row record
  column and inserts via (_$diff._$newdata).*; an anonymous record's
  typmod is not registered on other nodes, so reading it back from the
  distributed temp table failed with "record type has not been
  registered".  Store expanded columns (_$newdata.*) instead and build
  the INSERT from an explicit column list, as pre-merge GPDB did.

- The merge had also left both INSERT forms concatenated in one buffer,
  producing syntactically invalid SQL.

Verified: concurrent refresh of a distributed matview applies updates,
inserts and deletes correctly (changed group recomputed, new group
inserted, others preserved).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Upstream PG14 partitioned tables accept no storage options
(RELOPT_KIND_PARTITIONED is empty), but GPDB stores options like
fillfactor and analyze_hll_non_part_table on the root and its partition
DDL propagates them to children; pre-PG14 GPDB roots were plain tables
and always accepted them.  CREATE/ALTER on partitioned roots failed
with "unrecognized parameter" (incremental_analyze, alter_table_set).

Validate partitioned-root reloptions against RELOPT_KIND_HEAP.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…G14)

CreateFunction()'s parse analysis of a SQL-standard body (BEGIN ATOMIC
/ RETURN) runs transformStmt() on the raw sql_body, which scribbles on
it; GPDB then dispatched the mutated statement to the QEs.  Execute a
copy and dispatch the original, following ExecDropStmt's
copy-before-execute pattern.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s (PG14)

CreateFunction() performs parse analysis of a SQL-standard body (BEGIN
ATOMIC / RETURN) with transformStmt(), which scribbles on the raw tree
(a SubLink's subselect is replaced with the transformed Query), and then
dispatches that mutated statement from its internal
CdbDispatchUtilityStatement.  The QEs' own parse analysis then failed
with "unexpected non-SELECT command in SubLink" for RETURN (SELECT ...)
bodies (create_function_3, rowsecurity).  Snapshot the statement at
function entry and dispatch the pristine copy.

A mechanical writer/reader field-symmetry scan of the raw-grammar nodes
also found:

- _readLockingClause missing the waitPolicy enum that the writer emits:
  every dispatched raw FOR UPDATE/SHARE clause desynchronized the
  stream by one enum (NOWAIT/SKIP LOCKED silently became garbage).
- WindowDef, RangeFunction, XmlSerialize and TableLikeClause had
  binary writers but no readers at all.

Verified: RETURN (SELECT count(*) FROM view) evaluates on segments;
BEGIN ATOMIC procedures dispatch; a SQL-body function containing FOR
UPDATE SKIP LOCKED round-trips.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…G14)

A query touching a partitioned table whose partition key uses a custom
operator class (regression test create_table: PARTITION BY RANGE
(a test_int4_ops)) crashed the coordinator inside ORCA, killing every
regress run at that point.  The metadata translation of such a key
raises mid-retrieval, and the optimizer state left behind is corrupted
enough that subsequent accesses SIGSEGV at varying sites (CMDTypeInt4GPDB
construction, CAutoTraceFlag, CSerializableStackTrace, CMDAccessor
hashtable) -- under gdb the faulting pointers differ run to run.

Detect the situation in standard_planner() before ORCA engages: walk the
range table and compare each partitioned table's partition opfamilies
against the default opclass for the column type; on mismatch skip ORCA
and use this planner, which handles custom partition opclasses fine.

Verified: the create_table sequence (custom btree opclass, partitioned
table, inserts, selects) runs to completion with "Postgres query
optimizer" plans; the coordinator survives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hardening extracted from debugging the custom-partition-opclass crash;
each of these turned a NULL-pointer SIGSEGV into defined behavior:

- GPOS_FTRACE / GPOS_SET_TRACE / GPOS_UNSET_TRACE dereferenced
  ITask::Self() unguarded; with no current task the coordinator
  crashed.  No task now means no trace flags (read as unset, set as
  no-op).
- CAutoTraceFlag's ctor/dtor called ITask::Self()->SetTrace() directly;
  same guard.
- CWorkerPoolManager's single worker slot made nested gpos_exec
  destructive: the inner worker overwrote the slot and its removal left
  the OUTER task with ITask::Self() == NULL for the rest of its
  lifetime.  Workers now save and restore the previous registration.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
67 expected files (the _optimizer variants and tests whose expected
content predates the PG14 test renewals) regenerated from the output of
a clean full run (fresh container, pristine build, mirrored demo
cluster, 28 fix commits) at 399 ok / 239 failed of 638.

Selection was restricted to vetted categories: diffs against an
_optimizer variant or one-sided diffs (new PG14 query blocks missing
from expected), and only tests whose results introduce no ERROR lines.
Tests failing with real errors and value-level mismatches outside these
categories remain untouched for individual review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
48 more vetted regenerations (uao_compaction/uaocs_compaction/uao_ddl
and other slash-named suites missed by batch 1 due to name handling),
same selection rules: _optimizer variants or one-sided new-content
diffs with no ERROR lines in results.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The merge stitched together upstream PG14's early pg_trigger open (for
the CREATE OR REPLACE TRIGGER duplicate scan and OID generation) with
GPDB's pre-existing open for dispatch-aware OID assignment: two
table_open(TriggerRelationId) per trigger creation, one table_close.
Every FK/constraint trigger creation leaked a relcache reference
("WARNING: relcache reference leak: relation pg_trigger not closed" on
the QEs, four per FK), and each creation burned an extra OID via the
now-dead upstream GetNewOidWithIndex call whose result GPDB's
GetNewOidForTrigger immediately overwrote -- including in the OR
REPLACE case, where the old trigger's OID must be kept.

Use a single open: GPDB's dispatch-aware OID assignment moves into the
!trigger_exists branch (REPLACE keeps the existing OID, as upstream
intends), keyed identically on QD and QE.

Verified: CREATE TABLE ... REFERENCES emits only the intended GPDB
FK-not-enforced warning; no leak warnings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
VACUUM and \d output name pg_aoseg_<oid>/pg_aocsseg_<oid>/
pg_aovisimap_<oid>/pg_aoblkdir_<oid>; several expected files have
specific oids baked in, failing every run. Normalize both sides via
init_file matchsubs instead of regenerating new baked oids.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
38 tests reviewed individually via first-hunk digests; all diffs are
format or upstream-semantics drift: the PG14 \d+ "Compression" column,
reltuples=-1 sentinel for never-analyzed relations, renamed spgist
operators, psql connection-error prefixes, new GUCs in listings,
subscription/replication-slot new columns, test-content renewals
(opr_sanity helper removal, partition_aggregate extra aggregate,
SELECT INTO echo), and modernized plan shapes under GP_IGNORE.

Excluded and tracked separately: value-level suspects (sequence_gp
cache, complex boolean flip, alter_table_gp count, AO compaction index
stats, vacuum_full_ao segment size, pg_stat_last_operation
DETACH/ATTACH, partition_join pruning), unordered mega-result reorders
(qp_derived_table, qp_olap_window*), tests whose only residue is the
just-fixed pg_trigger leak or AO-oid warnings (self-resolving), the
serializable-AO error removal (possible merge-lost guard), and
environment-baked segment dumps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Batch 1 was regenerated from output predating the CreateTrigger
relcache-leak fix, baking the "relcache reference leak" warnings into
23 expected files; the fixed binary no longer emits them. Regenerated
from clean output; a strict scan verified the diffs contain only the
leak-warning removals (and masked AO-oid lines).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
GPDB forbids UPDATE/DELETE on append-optimized tables in SERIALIZABLE/
REPEATABLE READ transactions (the visimap machinery cannot honor a
fixed snapshot). Commit 13c98be moved that check into
ExecInitModifyTable, and the PG14 rework dropped it entirely -- such
statements silently ran without the safety (uao_dml expected errors
vanished). Re-add the check at the result-relation validation loop.

Verified: UPDATE under SERIALIZABLE and DELETE under REPEATABLE READ on
an AO table error as before.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n (PG14)

GPDB extends INCLUDING STORAGE to carry the source access method and
reloptions (appendonly orientation, compression, blocksize); the PG14
deferred-LIKE rework dropped it, so LIKE of an AO table silently created
a heap table (create_table_like_gp). Restore the carry-over at parse
time in transformTableLikeClause, before DefineRelation, unless the new
table specifies its own AM/options.

Verified: LIKE ... INCLUDING STORAGE of an ao_column+zlib table yields
ao_column with compresstype=zlib; bare LIKE still yields heap.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The MPP-6929 metadata tracking for DETACH lived only in
ATExecDetachPartitionFinalize, i.e. the DETACH CONCURRENTLY ... FINALIZE
entry point; a plain ALTER TABLE ... DETACH PARTITION (which calls
DetachPartitionFinalize directly) logged nothing, so
pg_stat_last_operation kept reporting the partition's original ATTACH
(pg_stat_last_operation test). Move the tracking into
DetachPartitionFinalize, shared by both paths.

Verified: after a plain DETACH the partition's last operation reads
PARTITION/DETACH.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
dimoffon and others added 30 commits June 21, 2026 23:08
test_GetNewTransactionId_xid_warn_limit exercises the warn-limit path,
which (unlike the stop-limit path that ereport(ERROR)s first) continues
into the XID-assignment code.  There GetNewTransactionId() indexes
ProcGlobal->xids[MyProc->pgxactoff] and
ProcGlobal->subxidStates[MyProc->pgxactoff].  The test left its stack
PGPROC uninitialized, so pgxactoff was garbage; it happened to be 0 (a
valid index into the size-1 arrays) before, but the PG15 PGPROC layout
change turned it into an out-of-bounds index, segfaulting the test and
breaking `make unittest-check` in CI.

Zero-initialize the stack PGPROC and PROC_HDR so pgxactoff is 0 and
MyProc->subxidStatus is empty (satisfying the asserts in that path).  The
test logic is otherwise unchanged and now passes 5/5.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pile fix)

PG15 moved the WAL record's main data out of XLogReaderState into the
decoded record (XLogReaderState->record, a DecodedXLogRecord); the main
data is now reached via XLogRecGetData()/XLogRecGetDataLen() which
dereference record->record->main_data[_len].  cdbappendonlyxlog_test.c
still assigned mockrecord->main_data directly, so it failed to compile
("'XLogReaderState' has no member named 'main_data'"), breaking
`make unittest-check`.

Point the mock reader at a stack DecodedXLogRecord (header + main_data +
main_data_len, max_block_id = -1) so ao_insert_replay/ao_truncate_replay
read the data through the PG15 accessors.  Tests pass 2/2.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (PG15)

Back-to-back segwalrep failover/recovery tests race a just-promoted or
just-recovered segment that is still transiently unavailable.  Two
distinct race classes were diagnosed:

1. The direct "NU: select 1" promotion-waits connect to the freshly
   promoted mirror before it finishes recovery and fail with
   "FATAL: the database system is not accepting connections".  This raw
   connection rejection is NOT covered by gp_gang_creation_retry.  Add a
   plpython helper wait_until_segment_accepts_connections(content_id) that
   polls pg_isready against the content's current primary (nudging FTS)
   until it is ready, and call it before the 1U/0U promotion-waits in
   recoverseg_from_file.

2. gprecoverseg's gang creation in mirror_promotion fails with
   "Segments are in reset/recovery mode" because a segment is still
   recovering.  mirror_promotion was missing the gp_gang_creation_retry
   bump that twophase_tolerance_with_mirror_promotion and
   failover_with_many_records already use; add it (120 x 1000ms ~= 120s)
   via gpconfig + gpstop -u, reset at the end.

The default gp_gang_creation_retry is only 5 x 2s = 10s, too short for an
in-order run.  Note: keep plpython helper bodies comment-free and free of
any trailing ';' -- the isolation2 harness splits commands on ';' at end
of line, which corrupts the function definition.

mirror_promotion's second, fault-injection scenario can still flake in
back-to-back in-order runs (the whole cluster goes transiently into
reset/recovery, where even coordinator-only helper queries error
uncatchably); that residual is environmental and tracked separately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Upstream PG15 commit cc50080 ("Rearrange core regression tests to
reduce cross-script dependencies") moved the shared C helper functions,
including binary_coercible(), into test_setup.sql, which runs before
create_function_0.  The PG15 merge kept test_setup.sql's definition but
did not remove create_function_0's now-duplicate one, so create_function_0
failed with:

    ERROR:  function "binary_coercible" already exists with same argument types

That failure (and the resulting missing-object cascade into downstream
tests) shows up in any regression run, and dominated the JIT
(jit=on jit_above_cost=0) installcheck matrix.  Remove the duplicate from
both the input and output .source templates; test_setup.sql's definition
(which runs first) serves every later test.  Verified: test_setup,
create_function_0, create_function_c and opr_sanity (a binary_coercible
consumer) all pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…_conversion, greenplum test_setup)

Two more PG15-merge dedup misses that broke `make installcheck`
(installcheck-good runs parallel_schedule then greenplum_schedule in one
database), surfaced while triaging the JIT installcheck matrix:

* test_enc_conversion() is created by conversion.sql in PG15 (upstream
  commit cc50080), but create_function_0.source still defined it too,
  so conversion failed "function already exists".  Remove it from
  create_function_0 (only conversion uses it, and it creates its own).

* test_setup is the first test of parallel_schedule (PG15 upstream); the
  merge also added it to greenplum_schedule.  Since installcheck-good runs
  parallel_schedule first in the same database, the greenplum copy's CREATE
  TABLEs failed "already exists" and its INSERTs double-loaded the shared
  read-only tables, cascading into the greenplum tests.  Drop it from
  greenplum_schedule; parallel_schedule's run serves both.

With these plus the earlier binary_coercible dedup, the core
parallel_schedule passes under JIT (jit=on jit_above_cost=0) except misc;
the remaining greenplum_schedule failures are pre-existing PG15 answer
drift unrelated to JIT.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PG15 adopted upstream commit 6867f96, which hardened
pg_get_expr_worker() by walking the input node with pull_varnos() to
reject expressions containing Vars. The gp_partition_template.template
catalog column stores a serialized GpPartitionDefinition node tree (a
GPDB-specific node with no Vars); expression_tree_walker() has no case
for the GPDB partition nodes, so pull_varnos() errored with
"unrecognized node type: 740". The deparse path already handles
T_GpPartitionDefinition (get_rule_expr), so skip the Var-safety check
for it, restoring pre-PG15 behavior.

Fixes regress tests AOCO_Compression, bfv_partition, column_compression,
gp_partition_template, partition (opt=on and opt=off).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PG15 adopted upstream's ATWrongRelkindError path in ATSimplePermissions,
which calls alter_table_type_to_string(cmdtype) and, when it returns
NULL, falls through to the internal-error elog "invalid ALTER action
attempted on relation". The GPDB-specific AT_ExpandTable /
AT_ExpandPartitionTablePrepare values had no case, so EXPAND TABLE on a
wrong relkind (e.g. a view) raised that internal error instead of a
clean message. Add the cases and regenerate expand_table.out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PG15 merge left two complete SELECT...FROM pg_class blocks in
getTables(), both appending to the same query buffer, producing
malformed SQL. The first block was upstream PG15's rewritten minimal
query (no WHERE/ORDER/execute); the second is the GPDB query that the
result parsing actually depends on (it reads distclause, parrelid,
parlevel, relstorage, partclause, parttemplate via PQfnumber) and which
carries the full FROM/JOINs/WHERE/ORDER and ExecuteSqlQuery. Drop the
leftover upstream block, keeping the GPDB query.

Fixes regress test pg_dump_binary_upgrade (opt=on and opt=off).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PG15 createdb rewrite (WAL_LOG/FILE_COPY strategies) dropped two
GPDB pieces from the new strategy functions:
  - ScheduleDbDirDelete(): registers the destination DB directory on the
    PendingDBDelete list so a failed/aborted CREATE DATABASE removes it
    (GPDB removes the dir via pending-deletes; createdb_failure_callback's
    upstream remove_dbtablespaces is #if 0'd out for GPDB).
  - the create_db_after_file_copy / after_xlog_create_database fault
    injection points used by the createdb regress test.
Re-graft both into CreateDatabaseUsingFileCopy (and ScheduleDbDirDelete
into CreateDatabaseUsingWalLog), matching adb-6.x.

PG15 defaults to the wal_log strategy, but these faults are file-copy-path
specific, so the createdb test requests STRATEGY file_copy for every fault
case (db_with_leftover_files, db2, db3, db4).  This also avoids a buffer
leak: the wal_log strategy copies relations through the shared buffer
cache, and createdb_failure_callback only drops those buffers when an error
is caught inside its PG_ENSURE_ERROR_CLEANUP block.  db4's CASE 4 aborts via
an end_prepare_two_phase panic during 2PC commit -- after the block has
ended -- so the callback never runs; with wal_log, ScheduleDbDirDelete would
then unlink the directory while its dirty buffers remained, orphaning them
('could not write block ... No such file or directory').  file_copy uses
copydir (no buffer-cache load), so there are no buffers to orphan.

Fixes regress test createdb (opt=on and opt=off).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A backend that spilled to a shared work file (FileSet) crashed on exit:
pgstat_shutdown_hook() runs as a before_shmem_exit callback and tears down
the cumulative-stats state, then dsm_backend_shutdown() (an on_shmem_exit
callback, so it runs later) detaches the DSM segment, whose cleanup deletes
the FileSet's temporary files:

  pgstat_report_tempfile        <- asserts pgstat is up (pgstat.c:1227)
  ReportTemporaryFileUsage
  PathNameDeleteTemporaryFile
  FileSetDeleteAll
  dsm_detach
  dsm_backend_shutdown

Reporting temp-file usage after the stats subsystem is shut down trips
pgstat_assert_is_up() under assertions, and touches detached stats shared
memory in any build.  Because only backends that spilled hit this, and more
queries spill under load, it crashed segments intermittently during
installcheck-good and cascaded ('terminating connection because of crash of
another server process') into dozens of unrelated tests.

Skip the temp-file stats report once proc_exit is in progress; per-file
accounting is moot for a backend that is leaving, and query-time temp-file
deletions still report normally (proc_exit_inprogress is false then).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
COPY ... IGNORE EXTERNAL PARTITIONS over a partitioned table that has an
external (foreign) partition crashed the QD planner with
FailedAssertion("child_rel != NULL", planner.c:9046).

expand_partitioned_rtentry() walks live_parts and, for the GPDB
"skip foreign partitions" hack, does 'continue' for a foreign partition
without building its part_rels[] entry -- but it left that partition's index
in live_parts.  PG15's apply_scanjoin_target_to_paths() now iterates
live_parts and asserts a non-NULL part_rels[] entry for every live member
(PG14 tolerated NULL slots), so the skipped external partition tripped the
assert (and would dereference a NULL RelOptInfo in a non-assert build).

Remove the skipped partition from live_parts too, keeping live_parts and
part_rels[] consistent for all downstream consumers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A QD backend could SIGSEGV in mppExecutorCleanup():

  mppExecutorCleanup        estate->dispatcherState  (execUtils.c:1727)
  standard_ExecutorStart    PG_CATCH (execMain.c:335)
  PortalStart / exec_simple_query

standard_ExecutorStart() runs the resource-manager operator-memory
assignment (PolicyAutoAssignOperatorMemoryKB / PolicyEagerFree...) inside a
PG_TRY, and on error calls mppExecutorCleanup() from the PG_CATCH.  That
assignment happens before queryDesc->estate is created, so mppExecutorCleanup
dereferenced a NULL estate and crashed the backend -- which on the QD takes
down the whole coordinator (crash recovery) and cascades into concurrently
running tests.  It was hit intermittently under load by a complex query whose
operator-memory needs exceeded the (contention-reduced) query_mem, e.g. the
psql \d publications query (a 3-way UNION) -- 'insufficient memory reserved
for statement' was thrown during executor start.

Return early from mppExecutorCleanup() when queryDesc->estate is NULL: the
executor state was never built, so there is nothing to tear down, and the
original error then propagates as a normal ERROR.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
GPDB extends SQL window frames to allow non-constant (column-valued) start/end
offsets, e.g. ROWS BETWEEN <expr> PRECEDING where <expr> references the current
row.  compute_start_end_offsets() only (re)computes an offset when its
start_offset_valid / end_offset_valid flag is false, and those flags mean
"valid for the current row".  The PG15 merge kept upstream's advance-current-row
and begin_partition logic (which resets framehead_valid/frametail_valid but not
the GPDB offset-valid flags, since upstream offsets are constant), so the flags
were only ever set true and never reset.  The frame offset was therefore frozen
at the first row's value, producing wrong window-aggregate results whenever the
offset varied across rows (e.g. SUM over ROWS BETWEEN off PRECEDING gave the
off-of-the-first-row frame for every row).

Re-graft the per-row resets (matching adb-6.x): in begin_partition() and when
advancing the current row -- unconditionally for RANGE framing, and for
ROWS/GROUPS only when the offset is not var-free (non-constant).

Fixes regress test qp_olap_window (and related non-constant-frame cases).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PG15 merge mangled describeRoles() in psql: the three GPDB-specific
role columns (rolcreaterextgpfd/rolcreatewextgpfd/rolcreaterexthttp) were
appended only inside a verbose-only branch (which also contained a duplicate
query and a duplicate FROM clause), but the result loop reads them as columns
9/10/11 unconditionally.  So plain \du built an 11-column (0..10) query and
the loop read a non-existent column 11 -> libpq "column number 11 is out of
range", failing \du / \d / \dE / \df+ etc.

Restore the adb-6.x structure: append the three GPDB columns unconditionally
right after the base SELECT (cols 9,10,11), keep only the description column
in the verbose branch, and drop the duplicate query/FROM.  Also fix the
"Bypass RLS" read to add the +numgreenplumspecificattrs offset that the
"Replication" read already had (it was reading rolcreatewextgpfd instead of
rolbypassrls -> wrongly printing "Bypass RLS" for CREATEEXTTABLE-writable
superuser roles, which have rolbypassrls=f).

Regenerate expected/psql_gp_commands.out for the two corrected \du rows
(test_psql_du_e3/e4 no longer show a bogus "Bypass RLS").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…owner

Two PG15 adaptations make gp_owner_permission green:

1. The test creates its own database reindexdb2 and then creates tables in it
   as non-superuser test1.  PG15 no longer grants CREATE on schema public to
   PUBLIC in a freshly created database, so those CREATEs failed with
   "permission denied for schema public".  Add GRANT ALL ON SCHEMA public TO
   public right after \c reindexdb2 (mirroring regress test_setup.sql, which
   already does this for the regression DB).  The reindex-by-database-owner
   behavior under test is unchanged; only the setup precondition is restored.

2. Regenerate expected: the recorded REINDEX user is now test1 (the table
   owner) instead of test2 (the actor).  This is correct upstream PG behavior —
   reindex_index() switches to the table owner's userid (index.c, "Switch to
   the table owner's userid, so that any index functions are run" as that
   owner) so functions in index expressions can't escalate.  REINDEX DATABASE
   run by test2 rebuilds test1's indexes as test1, so pg_stat_operations
   records test1.  No success->error, no lost rows (gate clean).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PG15 merge took upstream's unconditional "skip all internal triggers"
in CloneRowTriggersToPartition() and dropped GGDB's refinement.  Unlike
upstream (which marks partition-propagated row triggers tgisinternal=false),
GGDB marks them tgisinternal=true, so the unconditional skip drops them: when
ADD/SPLIT/EXCHANGE PARTITION creates a sub-partition under an intermediate
partition, the clone from that intermediate parent finds only its own
(internal) inherited trigger, skips it, and the grandchild never receives the
trigger -- relhastriggers stays false on the second-level sub-partitions
(seen in the partition test's part_inherit cases).

Restore the adb-8.x condition: only skip an internal trigger when the parent
is not itself a partition, or the trigger is not a partition-inherited clone
(no tgparentid).  Partition-inherited row triggers (internal, tgparentid set)
are then cloned down to the new sub-partition.

Verified: minimal repro (multi-level part table + AFTER INSERT row trigger +
ALTER TABLE ADD PARTITION) now sets relhastriggers=true on the l2
grandchildren; the full partition regress test passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…_sleep

The QD-combine of per-segment tuple stats (pgstat_combine_one_qe_result) is
correct, but the test relied on 'select pg_sleep(0.77)' to let an implicit
pgstat_report_stat() flush the QD's pending cumulative stats to shared memory
before reading pg_stat_all_tables_internal.  PG15 rewrote the stats system to
shared memory and rate-limits flushes (PGSTAT_MIN_INTERVAL = 1000ms), so 0.77s
is below the threshold and the read raced -> n_tup_ins etc. came back 0.

Replace each pg_sleep with PG15's pg_stat_force_next_flush() (the idiomatic way
to make pending stats deterministically visible).  For the matview cases the
cross-segment pg_stat_all_tables gather needs every segment flushed
(pg_stat_force_next_flush() via gp_dist_random), while the QD-local
pg_stat_all_tables_internal read needs the QD's own combined stats flushed --
so emit both a QD force and a segment force there (previously only the segment
side was nudged, leaving the REFRESH's combined inserts unflushed: 51 vs 153).

Test-only change (no backend change); pgstat_qd_tabstat now passes
deterministically (and runs in <1s instead of ~20s of sleeps).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PG15 removed the UDP statistics collector that aggregated per-resource-queue
counters (n_queries_exec/wait, elapsed_exec/wait) across backends, leaving
pgstat_report_queuestat() a stub and pgstat_fetch_stat_queueentry() returning
NULL -> pg_stat_resqueues always read 0 (resource_queue_stat, gp_toolkit).

Re-implement on the PG15 shared-memory cumulative stats system by adding a new
variable-numbered, OID-keyed (global) stats kind PGSTAT_KIND_RESQUEUE, modeled
on replication-slot stats (synchronous report, no pending/flush callback):

  - pgstat.h: add the enum value (among the variable-numbered kinds) and bump
    PGSTAT_FILE_FORMAT_ID so the pre-existing stats file is discarded across
    the upgrade restart.
  - pgstat_internal.h: PgStatShared_Resqueue (header + PgStat_StatQueueEntry).
  - pgstat.c: pgstat_kind_infos[PGSTAT_KIND_RESQUEUE] (variable, global,
    no pending/flush -- updated synchronously).
  - pgstat_gp.c: pgstat_report_queuestat() now forwards each portal's local
    accounting into the shared per-queue entry (pgstat_get_entry_ref_locked +
    accumulate + unlock) at statement end, then resets the local counters;
    pgstat_fetch_stat_queueentry() reads via pgstat_fetch_entry().

The backend-local per-portal hash (elapsed-time tracking) is unchanged.
resource_queue_stat now passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gp_toolkit creates its own database toolkit_testdb and, as the non-superuser
toolkit_user1, creates public.toolkit_usertest there.  PG15 no longer grants
CREATE on schema public to PUBLIC in a freshly created database, so that CREATE
failed with "permission denied for schema public" and the table was absent
from the __gp_user_data_tables_readable check.  Grant it right after
\c toolkit_testdb (mirroring regress test_setup.sql for the main database).

The resource-queue-stats portion of this test is fixed separately by the
PGSTAT_KIND_RESQUEUE change; with both, gp_toolkit passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the partition trigger-clone fix: CloneRowTriggersToPartition()
must keep skipping constraint-backed internal triggers (foreign-key / RI
triggers, tgconstraint set) -- the constraint cloning code
(CloneForeignKeyConstraints) re-creates those.  My earlier re-graft relied on
the tgparentid test to exclude them, which held in PG14 (adb-8.x) but not in
PG15: PG15 sets tgparentid on partition-inherited FK triggers, so they slipped
through and got cloned a second time.  The duplicate RI triggers then failed at
runtime with "constraint N is not a foreign key constraint" (seen in the
truncate test's FK-on-partitioned-table case).

Add an explicit OidIsValid(tgconstraint) skip.  Non-constraint partition-
inherited row triggers (tgconstraint = 0) are still cloned, so the
relhastriggers propagation to grandchildren (partition test) is preserved.
Verified: truncate passes again; the pit repro still shows relhastriggers=true
on the second-level sub-partitions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Consequence of the partition trigger-clone fix: GGDB propagates a partitioned
table's row triggers to ALL descendant levels (including grandchild
sub-partitions), whereas upstream PG15 does not via the clone path.  The merge
kept upstream's expected output, so the regress triggers test diverged in three
partition-trigger scenarios once the clone was restored:

  - trigpart: the trg1 trigger now also appears on the second-level
    sub-partitions trigpart41/trigpart42 (5->7 and 4->6 rows) -- matches adb-8.x.
  - trgfire:  trgfire4_30 now carries tg, so it shows in the per-partition
    tgenabled listing (7->8 rows), ENABLE TRIGGER no longer errors "does not
    exist" on it, and inserts routed there fire tgf() (raise 'except').
  - trigger_parted: aft_row now exists on the leaf, so the DROP no longer
    errors "does not exist".

Regenerated only these deterministic blocks from the full installcheck-good
results; the unrelated flaky MPP row-orderings (tttest etc.) and the
gpdiff-normalized segment-address/pid noise were intentionally left untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pure deterministic PG15 output changes (no env-specific text, no flaky ordering),
regenerated/edited per-test from the full installcheck-good results:
  - default_parameters, gp_create_view: pg_get_viewdef() now emits unnamed
    columns as AS "?column?" (PG15 ruleutils deparse).
  - gp_prepared_xacts: PG15 psql -c echoes the BEGIN command tag.
  - external_table: CREATE INDEX on a foreign table now errors "cannot create
    index on relation" + DETAIL "...not supported for foreign tables".
  - generated: error file:line drift tablecmds.c:14810 -> :14814.
  - explain: PG15 EXPLAIN (FORMAT JSON) gains GGDB memory-accounting keys
    (work_mem object, Executor Memory/Max Memory, Work Maximum Memory; all 0).

Done as targeted per-block edits (not whole-file cp) to avoid baking the
gpdiff-normalized segment-address/pid noise + flaky MPP row-orderings that the
larger EXPLAIN-ANALYZE tests carry (those need CI-tarball regen).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- distributed_transactions: PG15 psql -c echoes the BEGIN/COMMIT/CREATE TABLE
  command tags for the multi-statement utility-mode string.
- rangefuncs_cdb: PG15 GUC-list churn (enable_resultcache -> enable_memoize,
  + enable_group_by_reordering; 21 -> 22 rows) and the materialize-mode SRF
  rejection wording ("set-valued function ..." -> "materialize mode required,
  but it is not allowed in this context").
- table_statistics: PG15 upstream removed the blanket rejection of CLUSTER on a
  partitioned table; the two CLUSTER-on-index commands now succeed, and the bare
  CLUSTER reports "there is no previously clustered index" instead.
- pg_resetwal: PG15 added CheckDataVersion(); the test built a fake data dir
  with PG_VERSION=14, which pg_resetwal 15 now rejects.  Write "15" (matching
  the cluster) so the positive checksum-version tests run again.

Targeted per-block edits from the full installcheck-good results; pg_resetwal
verified ok standalone.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PG15 changed VACUUM's relfrozenxid computation: an empty (or all-dead) table is
frozen all the way to the current XID, so age(relfrozenxid) right after VACUUM is
~0, whereas PG14 set relfrozenxid to FreezeLimit (current - vacuum_freeze_min_age)
giving an age ~50.  Verified with a minimal repro (empty heap, advance ~500 XIDs,
"set vacuum_freeze_min_age=50; vacuum" -> age master 1 / segments 0, not 50).

The test's classify_age() bucketed ages as zero(0) / very young(<50) /
young(<100) / old, and asserted 'young' after a plain VACUUM and 'very young'
after VACUUM FREEZE.  Under PG15 both land at ~0, which fell into the 'zero' /
'very young' buckets and the age-0-vs-1 boundary is timing-sensitive.

Collapse the three small-age buckets into a single 'young' = "recently processed
by VACUUM freeze" (age < 100).  This preserves the test's actual intent -- prove
VACUUM advanced relfrozenxid (young) vs. an untouched table (old) -- and is robust
across PG14's ~50 and PG15's ~0 with no boundary flakiness.  Regenerated the
expected accordingly (52 'very young' + 13 'young' -> 65 'young'; 'old' unchanged;
no success->error).  Verified green under both optimizer=off and optimizer=on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The "analyze behavior w.r.t. statistics" case checks pg_stat_get_live_tuples
/ pg_stat_get_dead_tuples after insert(×3)+delete on an AO/AOCO table and
expects 12 live / 18 dead.  It relied on two `select pg_sleep(0.6)` calls to
let an implicit pgstat_report_stat() flush the QD's pending cumulative stats
(the per-segment tuple counts combined by pgstat_combine_one_qe_result) to
shared memory at deterministic points: once before the final ANALYZE and once
before reading the stats.

PG15 rewrote the stats system to shared memory and rate-limits flushes
(PGSTAT_MIN_INTERVAL = 1000ms), so 0.6s is below the threshold and the flush
no longer happens reliably.  When the pending net +12 live delta is flushed
*after* ANALYZE has already SET live_tuples = 12 (its authoritative sample
value), the delta is ADDED on top, yielding 24.  Both the row and column
variants were flaky for this reason (observed 12 and 24 across runs); the
column variant failed consistently.

Replace each pg_sleep with PG15's pg_stat_force_next_flush() (the idiomatic
way to make pending stats deterministically visible), matching the earlier
pgstat_qd_tabstat fix.  Now reports 12/18 deterministically (verified across
many runs, both AM variants).  Test-only change; the QD-combine and ANALYZE
backend paths are correct.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ions

The PG15 merge left src/test/regress/{input,output}/misc.source badly
mangled — four independent breakages, all from the upstream PG15 reorg
(cc50080 "Rearrange core regression tests") being only partially
applied to the GGDB copy:

1. Orphaned DROP: the BTREE "non-functional updates" block has its two
   UPDATE tmp statements disabled (`/* */`, ORCA "multiple updates to a
   row" restriction) and the SELECT ... INTO tmp removed, but `DROP TABLE
   tmp;` was left live -> "table tmp does not exist". Moved `*/` past the
   DROP so the whole disabled block is one comment (upstream creates+drops
   tmp; GGDB can't run the UPDATEs under ORCA, so the bookends are moot).

2. Duplicated a_star inheritance stress test: PG15 moved this block into
   create_misc.sql (verified: adadae4 misc.sql has 0 `a_star*`, create_
   misc has it and renames a->aa + adds a text `a`). GGDB kept the old misc
   copy, so misc ran `SELECT * FROM a_star*` against the post-rename schema
   (`class | aa | a`) -> mismatch. Removed the duplicate from misc (it is
   byte-identical to create_misc's copy).

3. Dropped postquel infrastructure: PG15 misc.sql self-creates hobbies_r/
   equipment_r + the hobbies/equipment SQL functions; the GGDB merge dropped
   those CREATEs (they only survive in the unscheduled create_function_2),
   so the postquel queries errored "relation/column does not exist" instead
   of the GGDB QE-slice restriction the expected file checks. Restored the
   block from upstream PG15.

4. Dropped overpaid(emp): same story — `CREATE FUNCTION overpaid(emp)` (C,
   from regress lib) lived in unscheduled create_function_2; restored it to
   the top of misc (GGDB @libdir@/regress@DLSUFFIX@ form).

Regenerated output/misc.source from a clean run (seg-addr/pid suffixes
stripped per convention, @abs_builddir@/@libdir@ tokens restored). misc now
passes deterministically (3x) under BOTH optimizer=off and optimizer=on
(shared base .out, no misc_optimizer.out). Test-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…minism

Follow-up to 2c6981b. When I restored the postquel block from upstream
PG15, hobbies_by_name was the upstream form:
  AS 'select person from hobbies_r where name = $1'
That returns a scalar (hobbies_r.person%TYPE) from a multi-row match
('basketball' -> joe AND sally). On single-node PG the heap order is stable,
but hobbies_r is MPP-distributed so the "first" row is non-deterministic:
misc passed standalone (got 'sally') but the full installcheck-good run got
'joe'. GGDB's create_function_2 already carries the fix (with a comment:
"GPDB: use an order by to force the later test in 'misc' to return a
particular person, when multiple persons have the same hobby") — re-grafted
the `order by person` and its comment, and updated the expected to the
deterministic result ('joe'). misc now passes deterministically (3x) in both
standalone and full-schedule contexts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…build branch

gp_dqa failed under ORCA with optimizer_force_multistage_agg=on:
  ERROR: aggregate 2108 needs to have compatible input type and transition type
(2108 = sum(int4)). It also errored at EXPLAIN time (executor init).

ExecInitAgg picks the per-aggref transition function from aggref->aggsplit
(GGDB re-graft, with a comment: "check the aggref, not the node. ORCA can put
aggregates of different stages into one Agg node"): a combining aggref gets
transfn_oid = aggcombinefn. But the very next branch that decides HOW to build
the pertrans (combine path with combineFnInputTypes={transtype,transtype} vs
plain-transfn path with the aggregate's real input types) was still keyed on
the node-level aggstate->aggsplit.

For an ORCA multistage DQA plan, the final Agg node holds a COMBINE sum(int4)
aggref next to a single-stage count(distinct b); the node-level split is not
COMBINE, so the combining sum took the plain-transfn build path. That path runs
the strict/NULL-initval input-type check (nodeAgg.c build_pertrans_for_aggref
caller) which compares the aggregate's declared input type (int4) against the
transtype (int8) — not binary coercible — and errors. The planner builds the
same partial sum correctly, so opt=off was unaffected.

Root cause: the PG15 merge re-grafted the GGDB aggref->aggsplit change at the
transfn-choice site but took upstream's aggstate->aggsplit at the build-branch
site, leaving the two inconsistent. claude-merge-2 used aggref->aggsplit at
both. Fix: key the combine-build branch on aggref->aggsplit too (for upstream
single-split nodes aggref->aggsplit == aggstate->aggsplit, so no behavior
change there).

Verified: repro returns 10|55 (no error); gp_dqa green under optimizer on AND
off; aggregates (core agg path) green under both with full setup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
incremental_analyze failed: a plain ANALYZE / ANALYZE ROOTPARTITION on a
partitioned root sampled the root instead of merging the leaf statistics, so
the root's pg_stats showed direct-sample MCVs/histograms instead of the merged
values the test (and its own comment, "piggyback on the stats collected from
the leaf and merge them") expects.

leaf_parts_analyzed() iterates find_all_inheritors(), which includes the root
(and any mid-level partitioned tables). Its FIRST loop (the relpages/reltuples
check) correctly skips non-leaves with
  if (get_rel_relkind(partRelid) == RELKIND_PARTITIONED_TABLE) continue;
but the SECOND loop (the per-column fetch_leaf_att_stats check) was missing
that skip. It only avoided the root via the relTuples==0 short-circuit. Once a
prior merge ANALYZE sets the root's reltuples to the merged total (e.g. after
the first ANALYZE, then TRUNCATE which resets the leaves but not the root, or
an ANALYZE ROOTPARTITION after the leaves were analyzed), the root slips past
that short-circuit, fetch_leaf_att_stats() finds no own (stainherit=false)
column stats for it, and the function returns false -> merge is skipped ->
the root is sampled.

This was latent in claude-merge-2 (the root's reltuples happened to be 0 in the
failing scenarios); PG15's stats flow leaves the root with a non-zero reltuples
there, exposing it. Fix: give the second loop the same RELKIND_PARTITIONED_TABLE
skip as the first, so only true leaves are checked.

Root-caused with temporary elog instrumentation (now removed). Regenerated only
the three affected root rows of the ANALYZE-ROOTPARTITION-after-leaf-analyze
block in both expected files (targeted, not a full cp which would bake
gpdiff-normalized noise). incremental_analyze green under optimizer on AND off
(deterministic across runs).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…Y input

The two "negative" NEWLINE cases feed LF-terminated data to a COPY declared
`newline 'cr'`. GGDB's old monolithic copy.c CopyReadLineText carried an
EOL-aware end-of-copy-marker refinement (issue greenplum-db/gpdb#12454) that
reported this malformed input as "extra data after last expected column" with
the `\.` recognized as the end marker. The PG14 COPY split moved the live
FROM-parsing path to copyfromparse.c (upstream), where the #12454 logic does
not exist; that GGDB code now survives only in the unused copy.c copy.

The NEWLINE feature itself still works: valid input parses, and mismatched
input is still rejected — only the error differs. Text now errors
"end-of-copy marker does not match previous newline style"; CSV still errors
"extra data after last expected column" but the unrecognized `\.` appears in
the reported line. Both still reject (no success->error regression), and the
upstream message is a reasonable description of the mismatch.

Re-grafting #12454 into the upstream copyfromparse.c line reader would be
risky surgery on the core COPY parser for a malformed-input error message, so
instead accept the upstream behavior and regenerate the two negative-case
expected blocks (gpcopy is a .source test -> edit output/gpcopy.source, which
convert_sourcefiles regenerates expected/gpcopy.out from). Also drops the
baked seg-addr suffix that the new (QD-side) errors no longer carry. gpcopy
green, deterministic across runs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant