fix(octo): lease acquire/release must not bump updated_at (fixes restart loop)#28
Merged
Merged
Conversation
…art loop) The reconfigure-detection added in #22 treats an advancing updated_at as a 'config changed' signal and restarts the supervisor. But AcquireOctoWSLease / ReleaseOctoWSLease also set updated_at = now(), so every 30s lease renewal advanced updated_at → the next sweep saw a phantom reconfigure → cancelled + restarted the supervisor → re-acquired → bumped updated_at again. Result: the WS connection churned in a perpetual restart loop and the lease never stayed held (lease_held=f forever). Caught during end-to-end acceptance. Fix: drop 'updated_at = now()' from both lease queries. Lease acquire/renew/release is high-frequency operational churn, not a config change, so it must not advance updated_at — now only UpsertOctoInstallation (an actual reconfigure) does, which is exactly the signal the hub wants. Hand-synced the generated octo.sql.go (no param/column change → identical Go signature). Regression test TestOctoWSLease_DoesNotBumpUpdatedAt asserts acquire/renew/release leave updated_at untouched. Verified live: lease now holds steady, lease_expires_at advances while updated_at stays frozen, zero restart-loop log lines.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem (P0 regression introduced by #22, caught in E2E acceptance)
The reconfigure-detection added in #22 treats an advancing
updated_atas a 'config changed' signal and restarts the supervisor. ButAcquireOctoWSLease/ReleaseOctoWSLeasealso setupdated_at = now(), so every 30s lease renewal advancedupdated_at→ the next sweep saw a phantom reconfigure → cancelled + restarted the supervisor → re-acquired → bumpedupdated_atagain.Result: the WS connection churned in a perpetual restart loop and the lease never stayed held (
lease_held = fforever). The bot effectively never maintained its connection on the happy path.Fix
Drop
updated_at = now()from both lease queries. Lease acquire/renew/release is high-frequency operational churn, not a config change, so it must not advanceupdated_at— now onlyUpsertOctoInstallation(an actual reconfigure) does, which is exactly the signal the hub wants.Generated
octo.sql.gohand-synced (no param/column change → identical Go signature; repo has no sqlc binary or sqlc-drift CI).Tests
TestOctoWSLease_DoesNotBumpUpdatedAt— acquire / renew / release all leaveupdated_atuntouched (fails before this fix).Verified live on the deployment: after restart the e2e install's lease holds steady,
ws_lease_expires_atadvances every renewal whileupdated_atstays frozen at create time, and 0 restart-loop log lines (vs. one every sweep before).