fix: auto-recover from stale automation DB migration on startup #1301
jamiechicago312 wants to merge 6 commits into
When upgrading openhands-automation between versions that restructure their Alembic migration history, the automation service crashes because the existing SQLite DB references a revision that no longer exists. This makes the Automate tab completely unavailable with no guidance.

Changes:
- Add onOutput/onExit callback options to spawnService() for output monitoring and exit handling
- startAutomationBackend() now watches for migration error patterns in service output (e.g. 'migration failed', "Can't locate revision")
- On crash with a migration error, automatically deletes the stale DB and retries once, with clear log messages explaining the recovery
- Extract getAutomationDbPath() helper to avoid duplicating the DB path computation across ensureDirectories and startAutomationBackend
- Add tests for getAutomationDbPath

Fixes OpenHands#1300

Co-authored-by: openhands <openhands@all-hands.dev>
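The detect-then-retry-once flow described in the change list can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `startWithRecovery`, `looksLikeMigrationError`, and the injected `spawn` and `deleteDb` functions are hypothetical names, and the two patterns are only the examples quoted in the description. `spawn` stands in for `spawnService()` and is assumed to call `onOutput` for each output chunk and `onExit` with the exit code.

```javascript
// Example patterns quoted in the PR description; the real list may differ.
const MIGRATION_ERROR_PATTERNS = [/migration failed/i, /Can't locate revision/i];

// True when a chunk of service output looks like an Alembic migration failure.
function looksLikeMigrationError(chunk) {
  const text = String(chunk);
  return MIGRATION_ERROR_PATTERNS.some((pattern) => pattern.test(text));
}

// Hypothetical wrapper: `spawn` stands in for spawnService() and is assumed to
// call onOutput(chunk) per output chunk and onExit(code) when the process ends.
// `deleteDb` stands in for removing the stale SQLite file.
function startWithRecovery({ spawn, deleteDb, log = console.log }) {
  let retried = false; // shared across attempts so recovery runs at most once

  function launch() {
    let sawMigrationError = false; // reset for every attempt
    spawn({
      onOutput(chunk) {
        if (looksLikeMigrationError(chunk)) sawMigrationError = true;
      },
      onExit(code) {
        // Clean exit or a crash unrelated to migrations: nothing to recover.
        if (code === 0 || !sawMigrationError) return;
        if (retried) {
          log('Automation service failed again after DB reset; giving up.');
          return;
        }
        retried = true;
        log('Stale automation DB detected; deleting it and retrying once.');
        deleteDb();
        launch();
      },
    });
  }

  launch();
}
```

Keeping the `retried` flag outside `launch()` is what bounds the recovery to a single attempt: a second migration failure is logged and left for the developer to inspect rather than looping. In a real script, `deleteDb` could be something like `() => fs.rmSync(dbPath, { force: true })`.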
@openhands-agent is attempting to deploy a commit to the openhands Team on Vercel. A member of the Team first needs to authorize it.
✅ Mock-LLM E2E Tests: 54/54 passed · Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)
jamiechicago312 left a comment:
Read over files, lgtm
Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands resolve the merge conflicts
@jamiechicago312 it looks like you haven't created an OpenHands account yet. Please sign up at OpenHands Cloud and try again.
I'm on it! jamiechicago312 can track my progress at all-hands.dev
- Accept deletion of snapshot-tests.yml and snapshot test files (removed in main)
- Resolve AGENTS.md conflict by removing the Visual Snapshot Testing section (removed in main)

Co-authored-by: openhands <openhands@all-hands.dev>
Merge conflicts resolved. The following conflicts were addressed:
The core PR changes (
This comment was created by an AI agent (OpenHands) on behalf of the user.
OpenHands encountered an error: **Failed to send message to agent server: HTTP 503 error: no available server.** See the conversation for more information.
🛑 Mock-LLM E2E Tests: 59/59 passed · Commit:
Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)
🤖 OpenHands is reviewing this PR. Trigger label:
This comment was posted by an AI agent (OpenHands).
Thanks for the focused fix — the main One material gap remains:
Tests/verification:
Security-wise, I don’t see any new credential exposure or unsafe command construction in the changed path.
🔄 CHANGES REQUESTED
This comment was posted by an AI agent (OpenHands).
HUMAN:
I tested it and it worked for me.
AGENT:
This PR description was prepared with help from an AI agent (OpenHands) on behalf of the user.
Why
When upgrading `openhands-automation` between versions that change Alembic migration history, the automation service can fail on startup because the local SQLite database points to a revision that no longer exists. That makes the Automate tab unavailable and forces developers to manually remove `~/.openhands/automation/automations.db`.
Summary
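One way to picture the shared path helper this PR extracts is the sketch below. It only assumes the location quoted under Why (`~/.openhands/automation/automations.db`). The signature, the `getAutomationDir` companion, and the explicit `homeDir`/`sep` parameters are hypothetical choices made to keep the example free of imports; the actual helper in `scripts/dev-with-automation.mjs` may well use `os.homedir()` and `path.join` instead.

```javascript
// Layout assumed from the path quoted in the PR description:
//   ~/.openhands/automation/automations.db
const AUTOMATION_DIR_PARTS = ['.openhands', 'automation'];
const AUTOMATION_DB_FILE = 'automations.db';

// Directory holding the automation service's state (hypothetical helper).
function getAutomationDir(homeDir, sep = '/') {
  const base = homeDir.replace(/[\\/]+$/, ''); // drop trailing separators
  return [base, ...AUTOMATION_DIR_PARTS].join(sep);
}

// Single source of truth for the DB path, so directory setup and the
// stale-DB recovery step cannot drift apart.
function getAutomationDbPath(homeDir, sep = '/') {
  return [getAutomationDir(homeDir, sep), AUTOMATION_DB_FILE].join(sep);
}

console.log(getAutomationDbPath('/home/dev'));
// prints "/home/dev/.openhands/automation/automations.db"
```

Deriving the file path from the directory helper means a future change to the directory layout only has to be made in one place.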
Issue Number
Fixes #1300
How to Test
~/.openhands/automation/automations.db.
npm ci
npm test -- __tests__/scripts/dev-with-automation.test.ts
node scripts/dev-with-automation.mjs
Video/Screenshots
Not included.
Type
Notes
The snapshot workflow fix only skips the second Playwright pass when the comparison step already passed. It still regenerates PR snapshots when diffs or new snapshots are detected.