fix(tools): recover wrong-home absolute paths from non-Claude models #35
Merged
…ive paths
Non-Claude models (DeepSeek) ignore the "path is relative to project root"
tool contract and invent an ABSOLUTE path with the wrong home dir — e.g. the
curator called glob with /Users/boberik/.../agent-bober-ide/src when the real
root is /Users/bober4ik/.../agent-bober-ide. The path sandbox correctly rejected
it ("resolves outside the project root"), but the warning is non-fatal, so the agent
silently continued with degraded (empty) exploration results.
Three reinforcing fixes:
- sandboxPath (A): when an absolute path lands outside the root but still
contains the root's basename, re-anchor the suffix after it relative to the
real root and retry. Never widens the sandbox — the re-anchored path is
resolve(projectRoot, suffix) and is re-validated to be inside the root, so a
genuinely-foreign path (/etc/passwd) and traversal (../..) still fail closed.
Exported for direct unit testing.
- tool schemas (B): the read/write/edit/glob/grep path descriptions said
"relative to project root or absolute", actively inviting the bad behavior.
Tighten them to "pass a relative path, not an absolute one" with an example.
- environment context (C): inject the absolute project root and an explicit
"pass paths RELATIVE to the project root, do not construct absolute paths"
instruction into every agent prompt that has a path-bearing tool, so models
stop guessing the home directory in the first place.
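A rough sketch of how fixes (B) and (C) might look, for illustration only. Every name here (ToolSpec, takesPath, PATH_PARAM_DESCRIPTION, buildEnvironmentContext) is hypothetical and not taken from the codebase, and the sketch assumes both the project-root line and the guidance are gated on the agent having at least one path-bearing tool.

```typescript
// Hypothetical shapes and names; the real tool schema and prompt builder may differ.
interface ToolSpec {
  name: string;
  takesPath: boolean; // true for path-bearing tools such as read/write/edit/glob/grep
}

// (B) A tightened path description: relative only, with an example.
const PATH_PARAM_DESCRIPTION =
  'Path relative to the project root, e.g. "src/index.ts". ' +
  "Pass a relative path, not an absolute one.";

// (C) Environment lines injected into an agent prompt.
// Assumption: nothing is emitted when the agent has no path-bearing tool.
function buildEnvironmentContext(projectRoot: string, tools: ToolSpec[]): string {
  const lines: string[] = [];
  if (tools.some((tool) => tool.takesPath)) {
    lines.push(`Project root: ${projectRoot}`);
    lines.push(
      "Pass paths RELATIVE to the project root; do not construct absolute paths."
    );
  }
  return lines.join("\n");
}
```

Stating the real root explicitly removes the need for the model to guess a home directory, and the tightened description stops advertising absolute paths as acceptable input.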
Adds handlers.test.ts (9 sandboxPath cases incl. re-anchoring + security) and
environment.test.ts (project-root line + relative-path guidance gating).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What & why
Follow-up to #34 (DeepSeek provider support). When running the pipeline on DeepSeek, the curator called glob with an absolute path built from a hallucinated home directory: /Users/boberik/.../agent-bober-ide/src
The real home is /Users/bober4ik/…; the model invented boberik. Two mistakes compounded: it passed an absolute path at all (the tool contract says relative), and it guessed the home dir wrong. The sandbox correctly denied it, but the warning is non-fatal, so the agent silently continued with empty exploration results, degrading the run without failing it. This is the same class of "non-Claude model guesses its environment" problem as #34's host-environment injection.
Fix (three reinforcing layers)
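- A: sandboxPath re-anchoring. When an absolute path lands outside the root but still contains the root's basename (agent-bober-ide), re-anchor the suffix (src) relative to the real root and retry. This never widens the sandbox: the re-anchored path is resolve(projectRoot, suffix) and is re-validated to be inside the root, so a genuinely foreign path (/etc/passwd) and traversal (../..) still fail closed. Exported for direct unit testing.
- B: tool schemas. The read/write/edit/glob/grep path descriptions now say "pass a relative path, not an absolute one", with an example, instead of "relative to project root or absolute".
- C: environment context. Every agent prompt that has a path-bearing tool now carries the absolute project root and an explicit instruction to pass paths relative to it.
A makes it robust to the mistake; B + C reduce the mistake.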
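The re-anchoring step can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's actual implementation: the signature, the isInside helper, and the error text are hypothetical, and it uses only Node's built-in path module.

```typescript
import * as path from "node:path";

// Hypothetical helper: true when candidate is the root itself or lies beneath it.
function isInside(root: string, candidate: string): boolean {
  const rel = path.relative(root, candidate);
  if (rel === "") return true;
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Sketch of sandboxPath; the real function's signature may differ.
export function sandboxPath(projectRoot: string, requested: string): string {
  const root = path.resolve(projectRoot);
  const resolved = path.resolve(root, requested);
  if (isInside(root, resolved)) return resolved;

  // Re-anchor only absolute paths that still mention the root's basename.
  if (path.isAbsolute(requested)) {
    // normalize() collapses any ".." segments before the suffix is extracted.
    const segments = path.normalize(requested).split(path.sep);
    const index = segments.lastIndexOf(path.basename(root));
    if (index !== -1) {
      const suffix = segments.slice(index + 1).join(path.sep);
      const reanchored = path.resolve(root, suffix);
      // Re-validate so the retry can never widen the sandbox.
      if (isInside(root, reanchored)) return reanchored;
    }
  }
  throw new Error(`Path "${requested}" resolves outside the project root`);
}
```

With a made-up root of /home/real/agent-bober-ide, a request for /home/wrong/agent-bober-ide/src is re-anchored to the real root's src directory, while /etc/passwd and ../.. still throw because neither passes the containment check.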
Verification
- npm run build ✅ · npm run lint ✅
- npm test → 1787 passed, 3 skipped, 1 failed (+14 new tests). The single failure (skill-bundles › package.json version › is 0.14.0) is a pre-existing stale assertion on main, unrelated to this PR.
- handlers.test.ts (9 sandboxPath cases incl. re-anchoring + security boundaries), environment.test.ts (project-root line + relative-path guidance gating).
🤖 Generated with Claude Code