Skip to content

jiuwenclaw adapter: adapt to upstream rename + fix subprocess timeout hang#38

Open
lec77 wants to merge 4 commits into
SJTU-IPADS:mainfrom
lec77:jiuwenclaw-v2
Open

jiuwenclaw adapter: adapt to upstream rename + fix subprocess timeout hang#38
lec77 wants to merge 4 commits into
SJTU-IPADS:mainfrom
lec77:jiuwenclaw-v2

Conversation

@lec77

@lec77 lec77 commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Linked issue

Closes #8.

Summary

Adapt the jiuwenclaw adapter to upstream's package rename (jiuwenclawjiuwenswarm, v0.2.0) and the wire-format updates that followed, and root-cause-fix an 8.75h subprocess hang we hit during testing where a client-side timeout couldn't kill the spawned ACP child cleanly.

Changes

  • Adapt to upstream wire/package changes — rewrite the history.json parser for the current flat per-event shape; switch console script (jiuwenclaw-clijiuwenswarm-tui), module paths (jiuwenclaw.appjiuwenswarm.app, jiuwenclaw.app_clijiuwenswarm.channels.acp.app_acp), runtime dir (~/.jiuwenclaw/~/.jiuwenswarm/), and the pkill pattern. SkVM adapter name stays jiuwenclaw so CLI flags (--adapter=jiuwenclaw), config key, and ~/.skvm/proposals/jit-optimize/jiuwenclaw/... paths remain stable.
  • Fix subprocess timeout hang (shared runCommandWithEnv in hermes.ts) — spawn with detached:true and group-kill on timeout, plus a bounded pipe-drain so a surviving grandchild can't deadlock the EOF read. The prior code SIGTERM'd only the direct child; the jiuwenswarm ACP client spawns a grandchild that inherits the stderr pipe, so the EOF read never completed.
  • Stop server-side runaway on timeout (jiuwenclaw adapter) — on timedOut, restart the long-lived sidecar so a runaway session in the AgentServer can't keep burning CPU / poison the next run sharing the singleton sidecar.
  • Pass realpath of task.workDir — macOS mkdtemp returns /var/folders/... symlinks; resolving once before threading into the prompt hint and --workspace-dir keeps the hint, inputs["cwd"], and the agent's emitted absolute paths consistent.

Test plan

  • bunx tsc --noEmit
  • bun test — 103 adapter tests pass (regression suite for hermes covered too)
  • Manual verification:
# A. Timeout no-hang
bun run skvm run --task=skvm-data/tasks/weather_task_01/task.json \
  --adapter=jiuwenclaw --adapter-config=managed \
  --model=deepseek/deepseek-chat --timeout-ms=20000
# → returns in ~33s (20s timeout + ~6s sidecar restart), runStatus=timeout

# B. Bench end-to-end across pdf+weather × [no-skill,original]
bun run skvm bench \
  --tasks=pdf_task_01,pdf_task_02,weather_task_01,weather_task_02 \
  --model=deepseek/deepseek-chat \
  --adapter=jiuwenclaw --adapter-config=managed \
  --conditions=no-skill,original --concurrency=1
# → exit 0, all 8 runs complete

# C. jit-optimize end-to-end with optimizer engagement
bun run skvm jit-optimize \
  --skill=skvm-data/skills/weather --task-source=real \
  --tasks=weather_task_01,weather_task_02 \
  --target-model=deepseek/deepseek-chat --target-adapter=jiuwenclaw \
  --optimizer-model=cheap_ipads/gpt-5.5 \
  --adapter-config=managed --rounds=1 --runs-per-task=1 --max-steps=80
# → exit 0, optimizer engages and writes proposal

Notes for reviewers

  • The PR scope grew beyond issue jiuwenclaw adapter: support per-request workspace_dir + chat.usage from upstream #8 during implementation: while running skvm bench to verify the wire-format work, we hit an 8.75h subprocess hang (commits 3 + 4). Both are bug-fix-shaped under the CONTRIBUTING.md rule, so no separate issue was filed — happy to file one retroactively if preferred.
  • Two of the four commits touch shared code (runCommandWithEnv in hermes.ts) — strictly safer (kills more, doesn't hang on EOF), and the hermes regression suite passes, but worth a second look from someone familiar with hermes.
  • Companion upstream PR at openJiuwen-ai/jiuwenswarm#34 adds the protocol-side features this adapter consumes. That PR is independently mergeable; this adapter ships against a jiuwenswarm that has those features.
  • Known limitation (out of scope): an unrelated jiuwenswarm-side issue where write_file silently no-ops on some file-output tasks. To be filed upstream separately.

Checklist

lec77 added 4 commits May 31, 2026 20:42
Rebases the SkVM jiuwenclaw adapter onto the upstream wire format
landed in jiuwenclaw 0.1.10+ and the per-request workspace_dir +
chat.error.error_type extensions now upstream.

History parsing
- parseJiuwenClawHistory consumes the new flat record shape: tool_call
  / tool_result / metadata fields live at the record's top level
  (upstream stopped nesting under event_payload).
- Tokens are summed from chat.usage_metadata events; chat.usage_summary
  is ignored (redundant with the per-call sum and lacks cost). Cost
  populated from usage_metadata.total_cost when the provider client
  surfaces it.
- chat.usage (legacy) handling removed.

Per-request workspace
- Each run() passes task.workDir as --workspace-dir to app_cli; the
  patched AgentServer scopes sys_operation.work_dir for the duration
  of the prompt. A short prompt prefix nudges the model toward
  relative paths.
- AgentServer writes history.json under an internal acp_* session id;
  snapshotSessionDirs/pickNewSessionDir/historyCandidatePaths walk
  the sessions dir pre/post-run and pick the freshly-created entry.
  Workaround pending an upstream fix that surfaces the internal id.

Reentrant adapter lifecycle
- setup()/teardown() are reference-counted (setupCount). The bench
  and jit-optimize stack calls both at the orchestrator level AND
  inside runTask, and the inner pair would otherwise deadlock on the
  host-wide sidecar lock the orchestrator already owns.

Diagnostics
- diagnoseJiuwenclaw reads top-level error_type from chat.error
  records and prefixes the failure summary with [ErrorType] ...
- diagnose-failure test extended for the new field.

Tests
- New test/adapters/jiuwenclaw-parse.test.ts covers the new history
  parser end-to-end (token+cost summing, tool steps, fallback text,
  ignored noise events, malformed args resilience).

Docs
- docs/jiuwenclaw.md rewritten to reflect upstream-merged state:
  required-patch warning + token/cost-not-reported caveat removed,
  new sections for per-request workspace, token/cost/error reporting,
  and an honest "Known limitations" list (history under internal id,
  agent may bypass work_dir, subagents inherit static workspace,
  unary path no error_type, macOS teardown orphan sweep).

End-to-end verified: skvm run / skvm bench (file-check evaluator,
score 0.90 PASS) / skvm jit-optimize (round-0 0.884 → round-1 1.000)
all green against deepseek-chat target on the patched jiuwenclaw.
…iuwenswarm)

Upstream renamed the Python package from `jiuwenclaw` to `jiuwenswarm`
(v0.2.0, May 2026) and restructured the module tree. The GitHub repo
URL is unchanged; only the package and on-disk paths moved.

What changed on the wire:

- Console script: `jiuwenclaw-cli` -> `jiuwenswarm-tui`
  (entry: `jiuwenswarm.channels.acp.app_acp:main`)
- Orchestrator: `python -m jiuwenclaw.app` -> `python -m jiuwenswarm.app`
- ACP stdio bridge: `python -m jiuwenclaw.app_cli` ->
  `python -m jiuwenswarm.channels.acp.app_acp`
- Source-checkout entry file: `jiuwenclaw/app_cli.py` ->
  `jiuwenswarm/channels/acp/app_acp.py`
- Runtime dir: `~/.jiuwenclaw/` -> `~/.jiuwenswarm/`
  (config/.env, agent/sessions, lock file)
- pkill pattern: `jiuwenclaw\.app` -> `jiuwenswarm\.app`

What stays the same:

- SkVM adapter name (`--harness=jiuwenclaw`, the config key
  `adapters.jiuwenclaw`, `~/.skvm/proposals/jit-optimize/jiuwenclaw/...`).
  Keeping the SkVM-facing identifier stable preserves user CLI flags,
  cached proposals, and bench result paths through the upstream rename.
- ACP wire protocol, `--workspace-dir` flag, env-var keys
  (`API_BASE`/`API_KEY`/`MODEL_NAME`/`MODEL_PROVIDER`),
  port 19001 sidecar lock contract.
- `history.json` schema (flat per-event shape, `chat.usage_metadata`,
  `chat.error.error_type`).

The "Per-request workspace" section in docs/jiuwenclaw.md is rewritten
to describe the new mechanism: `workspace_dir` is now threaded into
`inputs["cwd"]` and installed onto openjiuwen's `CwdState` ContextVar
via `_update_runtime_config -> _seed_runtime_cwd -> init_cwd(...)`,
replacing the previous `sys_operation._run_config.work_dir` mutation
that became dead code once upstream's CwdState design landed.
Root-cause fix for an observed 8.75h hang where a jiuwenclaw eval run that
should have timed out at 5min instead blocked indefinitely.

A. runCommandWithEnv (shared by hermes + jiuwenclaw): the timeout did
   `proc.kill()` on the direct child only, then read stdout/stderr to EOF via
   `new Response().text()`. The jiuwenswarm ACP client (app_acp) forks a
   gateway-stdio grandchild with `stderr=sys.stderr`, so the grandchild
   inherits our stderr pipe FD. SIGTERM bypasses the parent's Python `finally`,
   orphaning the grandchild, which keeps the pipe open — so the EOF read never
   completes and Promise.all hangs forever. Fix:
   - spawn with `detached: true` (own process group) and group-kill on timeout
     via `process.kill(-pid, SIGKILL)` so the whole tree dies and pipes close;
   - bounded pipe drain: once the process exits, give pipes a short grace then
     abandon the readers, so a stray FD holder can never deadlock us on EOF.

B. jiuwenclaw adapter: A group-kills the *client* tree, but the long-lived
   sidecar's AgentServer is a separate process tree and keeps running the
   session — a degenerate run (observed: 1200+ tool calls looping on memory
   lookups, with no server-side iteration cap because task-loop mode forces
   max_iterations=sys.maxsize) keeps burning API quota and can poison the next
   run sharing the sidecar. On timeout, restart the sidecar (stopSidecar already
   reaps the whole tree) to abort the server-side session cleanly.

Verified: a repro (parent + grandchild holding stderr, 2s timeout) returns in
~2s with timedOut=true instead of hanging; grandchild reaped; hermes tests pass.
…ace-dir

macOS mkdtemp returns paths under the /var/folders symlink (/var →
/private/var). jiuwenswarm resolves --workspace-dir through Path.resolve(),
so its fs_operation sandbox checks writes against the *realpath*
(/private/var/...). When we hand the agent the raw /var/... path in the
prompt hint and --workspace-dir, three references can drift:

  - the hint's "Your working directory is /var/..."
  - the inputs["cwd"] / inputs["workspace_dir"] threaded into CwdState
  - the agent's emitted absolute paths

For absolute writes there is a real risk the unresolved /var/... form
fails some sandbox path comparison while relative writes (which resolve
against an already-canonical cwd) succeed. openjiuwen's _resolve_path
does resolve symlinks itself for the membership check, but keeping all
three references consistent removes a class of /var-vs-/private/var
confusion and makes the hint accurately reflect what get_cwd() returns
inside the agent.

Also propagates the canonical path to:
  - runCommandWithEnv cwd (matches what the subprocess sees)
  - parseJiuwenClawHistory / buildMinimalResult workDir args
    (RunResult.workDir is now the canonical path — same directory via
    the symlink, but no longer ambiguous for downstream evaluators)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

jiuwenclaw adapter: support per-request workspace_dir + chat.usage from upstream

1 participant