
fix: dispose connection pool on PGRES_TUPLES_OK to prevent worker deadlock #573

Merged
neoneye merged 2 commits into main from fix/dispose-pool-on-corrupted-connection
Apr 14, 2026

@neoneye neoneye commented Apr 14, 2026

Summary

  • When token_metrics_store hit a PGRES_TUPLES_OK database error, the corrupted psycopg2 connection was returned to the pool via session.remove(). Other Luigi worker threads then hung forever on pool_pre_ping (SELECT 1), deadlocking the entire pipeline at ~3% progress.
  • Call db.engine.dispose() before session.remove() so the corrupted connection is closed outright and all threads get fresh connections.
  • Add pool disposal as defense in depth when the _handle_task_completion callback exhausts all 3 DB retry attempts.
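
The ordering described in the second bullet could look roughly like the sketch below. It is not the code from this PR: the helper names `is_corrupted_connection_error` and `cleanup_after_db_error` are hypothetical, and detecting the failure by searching the exception text for `PGRES_TUPLES_OK` is an assumption. The only calls assumed on `db` are `engine.dispose()` and `session.remove()`, as exposed by a Flask-SQLAlchemy style object.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: the failure is recognizable by the libpq status name appearing
# in the exception text. The detection logic actually used in the PR is not shown.
CORRUPTION_MARKER = "PGRES_TUPLES_OK"


def is_corrupted_connection_error(exc: BaseException) -> bool:
    """Return True if the exception text mentions the PGRES_TUPLES_OK status."""
    return CORRUPTION_MARKER in str(exc)


def cleanup_after_db_error(db, exc: BaseException) -> bool:
    """Clean up after a failed DB write; return True if the pool was disposed.

    `db` is any object exposing `engine.dispose()` and `session.remove()`.
    """
    disposed = False
    if is_corrupted_connection_error(exc):
        logger.warning("Corrupted connection suspected, disposing pool: %s", exc)
        # Dispose first: the engine drops its current pool, so a connection
        # released afterwards is not handed out to other threads.
        db.engine.dispose()
        disposed = True
    # Release this thread's session in every case.
    db.session.remove()
    return disposed
```

Calling `cleanup_after_db_error(db, exc)` from the `except` branch of the metrics write keeps the dispose-then-remove order in one place. Errors that do not match the marker still get the usual `session.remove()` without touching the pool.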

Test plan

  • Deploy to Railway and trigger an iframe plan from mach-ai.com
  • Verify pipeline completes (or if PGRES_TUPLES_OK recurs, verify workers recover instead of deadlocking)
  • Verify regular plans from home.planexe.org still complete normally

🤖 Generated with Claude Code

neoneye and others added 2 commits April 14, 2026 19:40
…dlock

When token_metrics_store encountered a PGRES_TUPLES_OK error, the
corrupted psycopg2 connection was returned to the pool via
session.remove(). Other Luigi worker threads checking out connections
would hang forever on pool_pre_ping (SELECT 1), deadlocking the
entire pipeline.

Fix: call db.engine.dispose() before session.remove() so the corrupted
connection is closed outright. Also add pool disposal as defense in
depth when the _handle_task_completion callback exhausts all DB retries.
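
The defense-in-depth step for the callback might be sketched as follows. This is an illustration under assumptions, not the body of `_handle_task_completion`: the helper `run_with_db_retries`, its `delay_seconds` parameter, and the rollback between attempts are hypothetical. Only the limit of 3 attempts and the pool disposal after exhaustion come from the description.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)

MAX_DB_ATTEMPTS = 3  # the PR description mentions 3 DB retry attempts


def run_with_db_retries(db, operation: Callable[[], None],
                        delay_seconds: float = 0.0) -> bool:
    """Run `operation` up to MAX_DB_ATTEMPTS times.

    Returns True on success. If every attempt fails, the pool is disposed
    and the session removed before returning False.
    """
    for attempt in range(1, MAX_DB_ATTEMPTS + 1):
        try:
            operation()
            return True
        except Exception as exc:
            logger.warning("DB attempt %d/%d failed: %s",
                           attempt, MAX_DB_ATTEMPTS, exc)
            try:
                # Assumption: roll back so the next attempt starts clean.
                db.session.rollback()
            except Exception:
                logger.exception("Rollback failed after DB error")
            if attempt < MAX_DB_ATTEMPTS:
                time.sleep(delay_seconds)
    # Every attempt failed: treat the pool as suspect and drop it.
    logger.error("All %d DB attempts failed, disposing pool", MAX_DB_ATTEMPTS)
    db.engine.dispose()
    db.session.remove()
    return False
```

Disposing only after the last attempt keeps the common case cheap, since a transient error that clears on retry never forces other threads to reconnect.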

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@neoneye neoneye merged commit bf87610 into main Apr 14, 2026
3 checks passed
@neoneye neoneye deleted the fix/dispose-pool-on-corrupted-connection branch April 14, 2026 17:53