
fix(octo): lease acquire/release must not bump updated_at (fixes restart loop)#28

Merged
lml2468 merged 1 commit into main from fix/octo-lease-no-updated-at-bump
Jun 12, 2026

@lml2468 lml2468 commented Jun 12, 2026

Collaborator

Problem (P0 regression introduced by #22, caught in E2E acceptance)

The reconfigure-detection added in #22 treats an advancing updated_at as a 'config changed' signal and restarts the supervisor. But AcquireOctoWSLease / ReleaseOctoWSLease also set updated_at = now(), so every 30s lease renewal advanced updated_at → the next sweep saw a phantom reconfigure → cancelled + restarted the supervisor → re-acquired → bumped updated_at again.

Result: the WS connection churned in a perpetual restart loop and the lease never stayed held (lease_held = f forever). The bot effectively never maintained its connection on the happy path.
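
A minimal simulation of the feedback loop described above, not the hub's actual code. The `installation` struct, `renewLease`, and `countRestarts` are hypothetical names, the 60s lease TTL is invented for illustration, and the sketch assumes one lease renewal per sweep. The `bumpUpdatedAt` flag models the pre-fix queries that also set `updated_at = now()`.

```go
package main

import (
	"fmt"
	"time"
)

// installation models only the two columns discussed in this PR.
// The struct itself is a hypothetical stand-in for the real row.
type installation struct {
	UpdatedAt        time.Time
	WSLeaseExpiresAt time.Time
}

// renewLease extends the lease. bumpUpdatedAt=true models the pre-fix
// behaviour, where the lease query also advanced updated_at.
func renewLease(inst *installation, now time.Time, bumpUpdatedAt bool) {
	inst.WSLeaseExpiresAt = now.Add(60 * time.Second) // assumed TTL
	if bumpUpdatedAt {
		inst.UpdatedAt = now
	}
}

// countRestarts runs the given number of sweeps, with one lease renewal
// before each sweep, and returns how many sweeps would have restarted
// the supervisor because updated_at advanced past the last seen value.
func countRestarts(sweeps int, bumpUpdatedAt bool) int {
	now := time.Date(2026, 6, 12, 0, 0, 0, 0, time.UTC)
	inst := &installation{UpdatedAt: now}
	lastSeen := inst.UpdatedAt
	restarts := 0
	for i := 0; i < sweeps; i++ {
		now = now.Add(30 * time.Second) // 30s renewal cadence from the PR
		renewLease(inst, now, bumpUpdatedAt)
		if inst.UpdatedAt.After(lastSeen) { // reconfigure detection
			restarts++
			lastSeen = inst.UpdatedAt
		}
	}
	return restarts
}

func main() {
	fmt.Println("restarts with updated_at bump:   ", countRestarts(5, true))
	fmt.Println("restarts without updated_at bump:", countRestarts(5, false))
}
```

With the bump enabled, every sweep sees a newer `updated_at` and counts a restart; with it disabled, no sweep does.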

Fix

Drop updated_at = now() from both lease queries. Lease acquire/renew/release is high-frequency operational churn, not a config change, so it must not advance updated_at — now only UpsertOctoInstallation (an actual reconfigure) does, which is exactly the signal the hub wants.

Generated octo.sql.go hand-synced (no param/column change → identical Go signature; repo has no sqlc binary or sqlc-drift CI).

Tests

TestOctoWSLease_DoesNotBumpUpdatedAt — acquire / renew / release all leave updated_at untouched (fails before this fix).

Verified live on the deployment: after restart the e2e install's lease holds steady, ws_lease_expires_at advances every renewal while updated_at stays frozen at create time, and 0 restart-loop log lines (vs. one every sweep before).

Found by running the full end-to-end smoke acceptance after merging the audit-fix series.

…art loop)

The reconfigure-detection added in #22 treats an advancing updated_at as
a 'config changed' signal and restarts the supervisor. But
AcquireOctoWSLease / ReleaseOctoWSLease also set updated_at = now(), so
every 30s lease renewal advanced updated_at → the next sweep saw a
phantom reconfigure → cancelled + restarted the supervisor → re-acquired
→ bumped updated_at again. Result: the WS connection churned in a
perpetual restart loop and the lease never stayed held (lease_held=f
forever). Caught during end-to-end acceptance.

Fix: drop 'updated_at = now()' from both lease queries. Lease
acquire/renew/release is high-frequency operational churn, not a config
change, so it must not advance updated_at — now only UpsertOctoInstallation
(an actual reconfigure) does, which is exactly the signal the hub wants.

Hand-synced the generated octo.sql.go (no param/column change → identical
Go signature). Regression test TestOctoWSLease_DoesNotBumpUpdatedAt
asserts acquire/renew/release leave updated_at untouched. Verified live:
lease now holds steady, ws_lease_expires_at advances while updated_at stays
frozen, zero restart-loop log lines.
@lml2468 lml2468 merged commit 7f2eda0 into main Jun 12, 2026
4 checks passed
@lml2468 lml2468 deleted the fix/octo-lease-no-updated-at-bump branch June 12, 2026 01:51