Bug
The web dashboard metadata-enrichment path uses the current project/default agent to enrich every session in a project, instead of using the agent persisted on each session record. In mixed-agent projects, a historical goose / gemini / grok / etc. session can be handed to the Codex plugin when the project default is currently Codex.
That causes two bad outcomes:
- Performance: Codex receives a non-Codex session with no codexThreadId, falls back to cwd-based discovery, and scans ~/.codex/sessions.
- Correctness: because cwd is shared, Codex can find an unrelated recent Codex rollout and attribute that Codex model/thread/cost to the non-Codex session.
Source: local AO dogfooding / live debugging with ao_timeout_again.har and Next OOM logs.
Reported by: @yyovil
Date: 2026-05-22
Analyzed against: upstream/main d5249228498fc5cb8f14609fcd81baa3630e5076; installed #1994 worktree commit 63ad054566e706eed989d1d9ce0b696bda58a4b6; root checkout was dirty/ahead so latest was fetched but not pulled.
AO version: 0.8.0
Environment: macOS Darwin 25.4.0 arm64, Node v22.22.2, zsh.
Confidence: High — reproduced locally with one concrete session metadata file and traced to exact web serialization code.
Concrete local example
Problematic session metadata file:
~/.agent-orchestrator/projects/agent-orchestrator_48321dec7a/sessions/ao-22.json
Relevant fields:
{
  "agent": "goose",
  "worktree": "/Users/tanishqpalandurkar/Projects/agent-orchestrator",
  "createdAt": "2026-05-20T12:48:10.866Z",
  "status": "stuck",
  "lifecycle": {
    "session": {
      "state": "terminated",
      "reason": "runtime_lost"
    }
  }
}
This file has no codexThreadId because it is not a Codex session.
Current local project state:
project sessions: 69
worker sessions: 68
sessions needing summary and missing codexThreadId: 43
~/.codex/sessions JSONL files: 982
largest JSONL observed: 92 MB
The problematic rows are recent, not ancient metadata. They were created between:
2026-05-20T12:48:10Z and 2026-05-21T09:28:37Z
Local reproduction evidence
Direct project listing is not the slow part:
sessionManager.list(project): 243ms, RSS ~79 MB
But forcing Codex enrichment on a non-Codex session reproduces the slow and wrong path:
session: ao-22
persisted agent: goose
codexThreadId: undefined
workspacePath: /Users/tanishqpalandurkar/Projects/agent-orchestrator
codex.getSessionInfo(ao-22): 7681ms
The result returned unrelated Codex metadata:
{
  "summary": "Codex session (gpt-5.5)",
  "agentSessionId": "019e4b8d-b055-7501-845e-b60da93cb526",
  "metadata": {
    "codexThreadId": "019e4b8d-b055-7501-845e-b60da93cb526",
    "codexModel": "gpt-5.5"
  },
  "cost": {
    "inputTokens": 51287022,
    "outputTokens": 134644,
    "estimatedCostUsd": 129.563995
  }
}
That thread id belongs to the current/recent Codex orchestrator, not to ao-22's stored goose session. The cwd fallback matched the shared repo path and selected a recent Codex rollout.
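The mis-attribution can be illustrated with a small model. This is only a sketch of the behavior described above, not the Codex plugin's actual code: the Rollout and SessionRef shapes, the findRollout helper, and the sample thread ids are all hypothetical, and the "newest rollout under the same cwd" rule is an assumption based on the observed result.

```typescript
// Hypothetical model of a cwd-based fallback; not the Codex plugin's real code.
interface Rollout {
  threadId: string;
  cwd: string;
  updatedAt: string; // ISO 8601 timestamp
}

interface SessionRef {
  id: string;
  workspacePath: string;
  codexThreadId?: string;
}

function findRollout(session: SessionRef, rollouts: Rollout[]): Rollout | undefined {
  // Exact lookup when the session carries its own thread id.
  if (session.codexThreadId !== undefined) {
    return rollouts.find((r) => r.threadId === session.codexThreadId);
  }
  // Assumed fallback: newest rollout recorded under the same cwd.
  const sameCwd = rollouts.filter((r) => r.cwd === session.workspacePath);
  sameCwd.sort((a, b) => Date.parse(b.updatedAt) - Date.parse(a.updatedAt));
  return sameCwd[0];
}

const repo = "/repo/agent-orchestrator";
const rollouts: Rollout[] = [
  { threadId: "codex-older", cwd: repo, updatedAt: "2026-05-19T08:00:00.000Z" },
  { threadId: "codex-orchestrator", cwd: repo, updatedAt: "2026-05-22T08:00:00.000Z" },
];

// A goose session: no codexThreadId, same repo path as the Codex rollouts.
const gooseSession: SessionRef = { id: "ao-22", workspacePath: repo };
const matched = findRollout(gooseSession, rollouts);
console.log(matched?.threadId); // "codex-orchestrator"
```

Because the only join key left is the shared repo path, any session without a codexThreadId resolves to whichever Codex rollout was most recent in that directory.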
Root Cause
Core session loading preserves enough information to know the session's intended agent. session-manager.ts resolves persisted session selection before core enrichment:
const selection = resolveSelectionForSession(project, sessionId, repaired.raw);
const effectiveAgentName = selection.agentName;
const plugins = resolvePlugins(project, effectiveAgentName);
But the web serialization/enrichment layer does not use that persisted session agent. It resolves the project and picks the current project/default agent:
const agentName = projects[i]?.agent ?? config.defaults.agent;
const agent = registry.get<Agent>("agent", agentName);
return enrichSessionAgentSummary(dashboardSessions[i], core, agent);
So if a project default is now codex, every session without a summary can be passed to Codex, regardless of whether the session metadata says agent: goose, agent: gemini, agent: grok, etc.
Why this caused the observed timeout/OOM family
This bug interacts with #1991 and #1855:
- The session-detail page calls fresh session-list endpoints for sidebar/project zone counts.
- /api/sessions?fresh=true bypasses listCached(), and web enrichment then runs over many worker sessions.
- Web enrichment selects Codex for non-Codex rows because the project default is Codex.
- Codex sees no codexThreadId and falls back to cwd-based discovery under ~/.codex/sessions.
- Multiple rows share the same repo cwd, so repeated request-path enrichment can repeatedly scan/parse unrelated Codex history.
- Client aborts/timeouts do not necessarily cancel server-side work already started.
This explains why #1992/#1994 improved real Codex sessions with codexThreadId, but did not fully fix the reload timeout/OOM: many rows are not Codex sessions at all, and the web layer is routing them into Codex anyway.
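A rough back-of-envelope estimate shows the scale. It uses only the two local figures quoted earlier (43 rows missing codexThreadId, 7681ms for one fallback) and assumes, without measurement, that every fallback costs about the same and that they run one after another; the real enrichment may be concurrent or vary per row.

```typescript
// Figures quoted from the local measurements in this report.
const rowsMissingThreadId = 43;
const observedFallbackMs = 7681;

// Assumption: each fallback costs about the same and they run serially.
const worstCaseMs = rowsMissingThreadId * observedFallbackMs;
const worstCaseSeconds = worstCaseMs / 1000;

console.log(`${worstCaseSeconds.toFixed(1)} s per fresh listing`); // "330.3 s per fresh listing"
```

Even if the real cost is a fraction of that, it is far above the 243ms measured for sessionManager.list(project) alone.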
Reproduction
- Use a project where the current default/project agent is codex.
- Have existing session metadata files in that project whose persisted agent is not codex, e.g. agent: goose, and which have no persisted summary.
- Open the session detail page or call the project sessions API path that triggers web metadata enrichment.
- Observe enrichSessionsMetadata(...) choosing the project/default Codex agent and calling codex.getSessionInfo(...) for the non-Codex session.
- On machines with a large ~/.codex/sessions, this can take seconds per fallback and can produce wrong Codex summary/cost metadata for non-Codex sessions.
Fix
- Preserve/expose the session's persisted agent on the Session object, or provide a core helper that resolves the effective agent for a loaded session.
- In packages/web/src/lib/serialize.ts, use the persisted session agent for session-specific summary enrichment before falling back to project/default agent for truly legacy records.
- Do not call Codex getSessionInfo() for a session whose persisted agent is known and is not Codex.
- Add regression tests with a mixed-agent project:
  - default/project agent is codex
  - stored session metadata has agent: goose
  - enrichSessionsMetadata(...) must not call Codex getSessionInfo() for that row
- Consider skipping agent summary enrichment entirely for terminal/runtime-lost rows unless persisted native metadata is already present.
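The selection change and the regression check could look roughly like the sketch below. Everything in it is hypothetical: AgentPlugin, SessionRow, resolveAgentName, and enrichRows are stand-ins and not AO's real types or the real enrichSessionsMetadata signature, the registry is modeled as a plain Map, and getSessionInfo is kept synchronous for brevity.

```typescript
// Hypothetical shapes for illustration; AO's real types and registry may differ.
interface SessionInfo {
  summary: string;
}

interface AgentPlugin {
  // Kept synchronous for brevity; a real plugin call may be async.
  getSessionInfo(sessionId: string): SessionInfo | null;
}

interface SessionRow {
  id: string;
  persistedAgent?: string; // the `agent` field from the session metadata file
  summary?: string;
}

// Prefer the agent stored on the session; fall back only for legacy rows.
function resolveAgentName(
  row: SessionRow,
  projectAgent: string | undefined,
  defaultAgent: string,
): string {
  return row.persistedAgent ?? projectAgent ?? defaultAgent;
}

function enrichRows(
  rows: SessionRow[],
  registry: Map<string, AgentPlugin>,
  projectAgent: string | undefined,
  defaultAgent: string,
): void {
  for (const row of rows) {
    if (row.summary !== undefined) continue; // already has a summary
    const agent = registry.get(resolveAgentName(row, projectAgent, defaultAgent));
    if (agent === undefined) continue; // unknown plugin: skip rather than guess
    const info = agent.getSessionInfo(row.id);
    if (info !== null) row.summary = info.summary;
  }
}

// Regression-style check with a mixed-agent project whose default is codex.
const calls: string[] = [];
const spy = (agentName: string): AgentPlugin => ({
  getSessionInfo(sessionId) {
    calls.push(`${agentName}:${sessionId}`);
    return { summary: `${agentName} session` };
  },
});
const registry = new Map<string, AgentPlugin>([
  ["codex", spy("codex")],
  ["goose", spy("goose")],
]);
const rows: SessionRow[] = [
  { id: "ao-22", persistedAgent: "goose" },
  { id: "ao-legacy" }, // no persisted agent: legacy fallback applies
];

enrichRows(rows, registry, "codex", "codex");
console.log(calls); // ["goose:ao-22", "codex:ao-legacy"]
```

The goose row never reaches the Codex spy, while the row with no persisted agent still falls back to the project/default agent, which matches the "truly legacy records" carve-out in the list above.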
Impact
- Dashboard/session-detail can time out or OOM on mixed-agent projects with a large Codex history.
- Non-Codex sessions can display incorrect Codex summaries/costs because cwd fallback may attach the wrong Codex rollout.
- This affects recent sessions too; it is not limited to pre-codexThreadId legacy Codex metadata.
Related