Bug: ClaudeCliBackend hardcodes --bare, which breaks subscription-token auth → silent 0-score
Repo: microsoft/SkillOpt
File: skillopt_sleep/backend.py — ClaudeCliBackend._call (and the attempt_with_tools variant)
Severity: High for any user whose claude CLI is authenticated by a logged-in subscription token rather than ANTHROPIC_API_KEY. The failure is silent: the cycle reports EXIT=0 but every model call returns an error string that scores 0.
Symptom
Running the Claude backend on a machine with no ANTHROPIC_API_KEY (auth via claude subscription login):
python -m skillopt_sleep dry-run --backend claude --scope invoked
=> n_tasks: 40, baseline 0.0 -> candidate 0.0, gate reject, edits: []
The llm_miner produces 0 checkable tasks and reflect proposes nothing — not because the sessions lack signal, but because every claude -p subprocess returns Not logged in · Please run /login, which the JSON parser turns into an empty/zero result.
Root cause
_call builds the command with --bare:
cmd = [
self.claude_path, "-p", "--output-format", "text",
"--bare", # <-- this
"--disable-slash-commands",
"--disallowedTools", "*",
"--exclude-dynamic-system-prompt-sections",
]
--bare skips plugin/hook/LSP init and the credential-resolution path that loads the logged-in subscription token. With an ANTHROPIC_API_KEY in env it still authenticates; with subscription-token auth it does not.
Reproduction (subscription-auth machine, no API key)
# authenticated
claude -p --output-format text -- "say pong"
# -> pong
# with --bare (what the backend uses)
claude -p --output-format text --bare --disable-slash-commands \
--disallowedTools '*' -- "say pong"
# -> Not logged in · Please run /login
# without --bare, other isolation flags kept
claude -p --output-format text --disable-slash-commands \
--disallowedTools '*' --exclude-dynamic-system-prompt-sections -- "say pong"
# -> pong <-- auth + isolation both intact
Confirmed end-to-end: after dropping --bare, the same dry-run mined 8 checkable
tasks from real transcripts and reflect proposed 8 concrete edits (gate then
correctly rejected them because the baseline already scored 1.0 — a separate,
expected behavior, not this bug).
Fix (one line, conditional)
--bare is only safe when auth comes from ANTHROPIC_API_KEY. Gate it:
cmd = [self.claude_path, "-p", "--output-format", "text"]
if os.environ.get("ANTHROPIC_API_KEY"):
cmd.append("--bare") # safe with API-key auth; strips subscription token otherwise
cmd += [
"--disable-slash-commands",
"--disallowedTools", "*",
"--exclude-dynamic-system-prompt-sections",
]
The remaining flags (--disable-slash-commands, --disallowedTools '*',
--exclude-dynamic-system-prompt-sections, clean temp cwd) already provide the
isolation --bare was added for: no global skills, no tool use, no per-machine
system-prompt sections, no project CLAUDE.md. Dropping --bare for
subscription auth loses only the plugin/hook/LSP skip (a minor warm-up cost),
not isolation.
Apply the same change to attempt_with_tools (which also hardcodes --bare).
Suggested hardening (separate)
A model call that returns a non-JSON error string ("Not logged in …") is silently
scored 0. Consider detecting known CLI error prefixes in _call and raising/logging
instead of returning them as content, so an auth failure surfaces loudly rather
than deflating every baseline to 0.
Bug:
ClaudeCliBackendhardcodes--bare, which breaks subscription-token auth → silent 0-scoreRepo: microsoft/SkillOpt
File:
skillopt_sleep/backend.py—ClaudeCliBackend._call(and theattempt_with_toolsvariant)Severity: High for any user whose
claudeCLI is authenticated by a logged-in subscription token rather thanANTHROPIC_API_KEY. The failure is silent: the cycle reportsEXIT=0but every model call returns an error string that scores 0.Symptom
Running the Claude backend on a machine with no
ANTHROPIC_API_KEY(auth viaclaudesubscription login):The llm_miner produces 0 checkable tasks and reflect proposes nothing — not because the sessions lack signal, but because every
claude -psubprocess returnsNot logged in · Please run /login, which the JSON parser turns into an empty/zero result.Root cause
_callbuilds the command with--bare:--bareskips plugin/hook/LSP init and the credential-resolution path that loads the logged-in subscription token. With anANTHROPIC_API_KEYin env it still authenticates; with subscription-token auth it does not.Reproduction (subscription-auth machine, no API key)
Confirmed end-to-end: after dropping
--bare, the same dry-run mined 8 checkabletasks from real transcripts and reflect proposed 8 concrete edits (gate then
correctly rejected them because the baseline already scored 1.0 — a separate,
expected behavior, not this bug).
Fix (one line, conditional)
--bareis only safe when auth comes fromANTHROPIC_API_KEY. Gate it:The remaining flags (
--disable-slash-commands,--disallowedTools '*',--exclude-dynamic-system-prompt-sections, clean tempcwd) already provide theisolation
--barewas added for: no global skills, no tool use, no per-machinesystem-prompt sections, no project
CLAUDE.md. Dropping--bareforsubscription auth loses only the plugin/hook/LSP skip (a minor warm-up cost),
not isolation.
Apply the same change to
attempt_with_tools(which also hardcodes--bare).Suggested hardening (separate)
A model call that returns a non-JSON error string ("Not logged in …") is silently
scored 0. Consider detecting known CLI error prefixes in
_calland raising/logginginstead of returning them as content, so an auth failure surfaces loudly rather
than deflating every baseline to 0.