WONT-FIX: fix: update stale ACP session provider instead of deleting #26

Open: gitricko wants to merge 1 commit into main from fix/acp-session-update-instead-of-delete

@gitricko (Owner)

Problem

After every container reboot, the custom provider used by old ACP sessions no longer exists in the Hermes config. The extension cached these session IDs locally in VSCode workspaceState. On reconnect it tries session/load, but ...

The real root cause (found post-merge of PR #25):

The ACP server's normalize_result() adapter converts any None return from load_session into {} (empty dict):

def normalize_result(payload: Any) -> dict[str, Any]:
    if payload is None:
        return {}  # ← None becomes empty dict, which is TRUTHY in JS

The extension checks if (null != await call("session/load", ...)). An empty object {} is neither null nor undefined (and is truthy in JavaScript), so the check passes: the extension thinks the session loaded and logs "resumed", but the agent never actually existed. Every subsequent session/prompt then fails with "session not found", leaving a phantom session that never responds.
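The failure mode can be reproduced in a few lines. This is a minimal sketch, not the server's or the extension's actual code: the non-None branch of normalize_result is an assumption (the excerpt above only shows the None branch), and js_not_null is a hypothetical helper that stands in for the extension's `null != value` check, with Python's None playing the role of JavaScript's null/undefined.

```python
from typing import Any


def normalize_result(payload: Any) -> dict[str, Any]:
    """Mirror of the adapter quoted above; the non-None branch is assumed."""
    if payload is None:
        return {}  # a missing session is flattened into an empty result
    return dict(payload)  # assumption: mapping payloads pass through unchanged


def js_not_null(value: Any) -> bool:
    """Hypothetical stand-in for the extension's `null != value` check."""
    return value is not None


missing = normalize_result(None)  # what load_session yields for an unknown ID
print(js_not_null(missing))  # True: the "session loaded" check passes
print(js_not_null(None))     # False: only a real null would have failed it
```

Note that an empty dict is falsy in Python but truthy in JavaScript, so the sketch deliberately models the null comparison rather than truthiness.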

Deleting the session (previous fix PR #25) made this worse — the DB row is gone, but normalize_result still returns {} (truthy), so the extension stays stuck on the phantom session with no escape hatch.

Fix

Instead of deleting stale sessions, UPDATE them:

  1. Set billing_provider to the current configured provider (e.g. omniroute)
  2. Set billing_base_url to the current OmniRoute endpoint
  3. Set model_config to a valid config with the current provider
  4. Clear old messages so history replay is a clean slate

The session ID is preserved, so the extension's locally-cached ID still resolves. The server creates a working agent with the current provider. The extension resumes with a clean conversation.

How it works

On boot, self-check.sh queries ~/.hermes/state.db for ACP sessions (source='acp') whose billing_provider is no longer in the current config. For each stale session it runs:

UPDATE sessions SET billing_provider='omniroute', billing_base_url='http://localhost:20128/v1',
    billing_mode='chat_completions', model_config='{"provider":"omniroute","base_url":"http://localhost:20128/v1"}'
WHERE id='<stale-session-id>';
DELETE FROM messages WHERE session_id='<stale-session-id>';

Sessions whose billing_provider is still configured (e.g. omniroute, modelrelay) or is NULL (uses the current default) are left untouched.
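The selection and update rules above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the contents of self-check.sh: the function name repair_stale_acp_sessions and its parameters are hypothetical, and the schema is limited to the columns named in this PR (the real sessions and messages tables may have more).

```python
import json
import sqlite3
from typing import Iterable


def repair_stale_acp_sessions(
    db: sqlite3.Connection,
    configured: Iterable[str],
    provider: str,
    base_url: str,
) -> int:
    """Repoint ACP sessions whose provider is gone; return how many were fixed."""
    known = list(configured)
    # NULL billing_provider means "use the current default", so skip those rows.
    query = (
        "SELECT id FROM sessions WHERE source = 'acp' "
        "AND billing_provider IS NOT NULL"
    )
    if known:
        placeholders = ",".join("?" for _ in known)
        query += f" AND billing_provider NOT IN ({placeholders})"
    stale = [row[0] for row in db.execute(query, known)]

    model_config = json.dumps({"provider": provider, "base_url": base_url})
    with db:  # one transaction: commit on success, roll back on error
        for session_id in stale:
            db.execute(
                "UPDATE sessions SET billing_provider = ?, billing_base_url = ?, "
                "billing_mode = 'chat_completions', model_config = ? WHERE id = ?",
                (provider, base_url, model_config, session_id),
            )
            # Clear history so replay starts from a clean slate.
            db.execute("DELETE FROM messages WHERE session_id = ?", (session_id,))
    return len(stale)
```

Bound parameters avoid quoting problems with session IDs, and wrapping both statements in one transaction means a session is never left repointed with its old messages still attached.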

Testing

Automated smoke test

cd /config/Documents/_code/hermes-webtop

# 1. Insert a stale ACP session
python3 -c "
import sqlite3
db = sqlite3.connect('/config/.hermes/state.db')
db.execute(\"\"\"INSERT OR REPLACE INTO sessions
  (id, source, billing_provider, billing_base_url, billing_mode,
   model, started_at, title, model_config)
  VALUES ('test-stale', 'acp', 'custom',
   'http://localhost:20128/v1', 'chat_completions',
   'auto-fastest', 1, 'Test',
   '{\"provider\":\"custom\",\"base_url\":\"http://localhost:20128/v1\"}')\"\"\")
db.commit()
db.close()
"

# 2. Run the cleanup
SKIP_CHECKS="services,models,mnemon,hermes,disk,memory,cron" \
  bash docker/self-check.sh 2>&1 | grep -A2 'Cleanup'
# Expected: "fixed 1 stale session(s), kept 0"

# 3. Verify session was KEPT with updated provider
python3 -c "
import sqlite3
db = sqlite3.connect('/config/.hermes/state.db')
r = db.execute('SELECT billing_provider, billing_base_url FROM sessions WHERE id=?', ('test-stale',)).fetchone()
print(f'provider={r[0]} base_url={r[1]}')
msgs = db.execute('SELECT COUNT(*) FROM messages WHERE session_id=?', ('test-stale',)).fetchone()
print(f'messages={msgs[0]}')
db.close()
"
# Expected (two lines):
#   provider=omniroute base_url=http://localhost:20128/v1
#   messages=0

Real VSCode test

  1. Use the Hermes extension in VSCode for a while — create 2-3 ACP sessions by asking questions
  2. Open a terminal in the container and check ACP sessions exist:
    sqlite3 /config/.hermes/state.db "SELECT id, billing_provider, title FROM sessions WHERE source='acp'"
    
  3. Reboot the container via docker restart <container>
  4. Wait for code-server to come back, open VSCode
  5. Open the extension — it should auto-connect
  6. The session list should show old sessions (IDs preserved)
  7. Type a message in any session — it should respond (server creates agent with current provider, conversation starts clean)
  8. Verify the fix ran in boot logs:
    journalctl -u docker 2>/dev/null | grep -i "acp session"
    
    Or check the self-check output from the container boot log

Root cause: ACP server's normalize_result() converts None returns from
load_session into {} (empty dict). The extension sees {} as truthy and
assumes the session loaded, but every prompt then fails with 'session
not found' -- a ghost session that never responds.

Deleting the session from state.db (previous fix) made this worse:
the extension's locally-cached session ID still existed in VSCode
workspaceState, but the DB row was gone. normalize_result still
returned {} (truthy), so the extension stayed stuck on a phantom.

Fix: UPDATE the stale session's billing_provider to the current
configured provider and clear its old messages. The session ID is
preserved so the extension's cached ID resolves. The server creates
a working agent with the current provider. The extension resumes with
a clean conversation.

Closes #25 (replaces the delete approach)
@gitricko gitricko added the bug Something isn't working label Jun 19, 2026
@gitricko gitricko changed the title fix: update stale ACP session provider instead of deleting WONT-FIX: fix: update stale ACP session provider instead of deleting Jun 22, 2026