awslabs · marcfargas · Apr 24, 2026
diff --git a/docs/PLAN-phase2.md b/docs/PLAN-phase2.md
diff --git a/docs/multiplexer-api-surface.md b/docs/multiplexer-api-surface.md
diff --git a/pyproject.toml b/pyproject.toml
@@ -70,11 +70,12 @@ markers = [
     "asyncio: marks tests that use asyncio",
     "integration: marks integration tests",
     "e2e: marks end-to-end tests",
-    "slow: marks tests as slow (deselect with '-m \"not slow\"')"
+    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
+    "smoke: opt-in tests that require real wezterm + provider CLIs on PATH; not run by default",
 ]
 asyncio_mode = "strict"
 testpaths = ["test"]
 python_files = "test_*.py"
 python_classes = "Test*"
 python_functions = "test_*"
-addopts = "--cov=src --cov-report=term-missing -m 'not e2e'"
+addopts = "--cov=src --cov-report=term-missing -m 'not e2e and not smoke'"
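
Reviewer sketch for the new `smoke` marker: the marker text says these tests need real `wezterm` and provider CLIs on PATH, so a smoke module may want to skip itself when they are absent. The helper name `missing_binaries` and the `SMOKE_BINARIES` tuple are hypothetical and do not come from the repository.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names; adjust to whatever the smoke tests actually need.
SMOKE_BINARIES = ("wezterm", "claude", "codex", "gemini")


def missing_binaries(
    names: Iterable[str] = SMOKE_BINARIES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names that cannot be resolved on PATH, in input order."""
    return [name for name in names if which(name) is None]


# Possible use inside a test module (requires pytest, which is not imported here):
#
#   pytestmark = [
#       pytest.mark.smoke,
#       pytest.mark.skipif(bool(missing_binaries()), reason="smoke prerequisites missing"),
#   ]
```

With the `addopts` above, an explicit `pytest -m smoke` should select these tests, because options given on the command line are parsed after those from `addopts`.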
diff --git a/spikes/01-result.md b/spikes/01-result.md
@@ -0,0 +1,19 @@
+# Spike 1 Result
+
+- Verdict: **GO**
+- Summary: spawn/send-text/get-text/kill-pane all worked with a standalone WezTerm window.
+- WezTerm binary: `C:\Users\marc\Downloads\WezTerm-windows-20260331-040028-577474d8\wezterm.exe`
+- WezTerm version: `wezterm 20260331-040028-577474d8`
+- Duration: `3312 ms`
+
+## Evidence
+- `spawn` pane id: `17`
+- shell ready marker observed: `True`
+- `send-text` exit code: `0`
+- `get-text` contains marker: `True`
+```text
+SHELL_READY
+marc@mafewin:/mnt/c/Users/marc$ echo hello-from-spike
+hello-from-spike
+marc@mafewin:/mnt/c/Users/marc$
+```
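
Reviewer sketch of the round trip that spike 1 reports (spawn, `send-text`, `get-text`, `kill-pane`). It is an illustration and not the spike script: it omits the shell-ready wait and any retry, and the names `run_cli` and `round_trip` are hypothetical. The command runner is injectable so the flow can be exercised without WezTerm installed.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run a command and return its stdout; raises on a non-zero exit."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def round_trip(wezterm: str, command: str, marker: str, run: Runner = run_cli) -> bool:
    """Spawn a pane, send a command, read the pane text back, then kill the pane."""
    # `wezterm cli spawn` prints the id of the new pane on stdout.
    pane_id = run([wezterm, "cli", "spawn", "--new-window"]).strip()
    try:
        run([wezterm, "cli", "send-text", "--pane-id", pane_id,
             "--no-paste", "--", command + "\n"])
        text = run([wezterm, "cli", "get-text", "--pane-id", pane_id])
        return marker in text
    finally:
        # Always clean up the pane, even if a step above failed.
        run([wezterm, "cli", "kill-pane", "--pane-id", pane_id])
```

A real backend would poll `get-text` until the marker appears instead of reading once immediately after `send-text`.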
diff --git a/spikes/02-result.md b/spikes/02-result.md
@@ -0,0 +1,84 @@
+# Spike 2 Result
+
+- Verdict: **NEEDS-WORKAROUND**
+- Per-CLI verdicts: `claude: neither, codex: neither, gemini: blocked`
+- Mode A: `wezterm cli send-text --no-paste -- '/help\n'`
+- Mode B: `wezterm cli send-text -- '/help\n'`
+
+## Recommendation
+- `claude`: prefer `custom workaround needed`
+- `codex`: prefer `custom workaround needed`
+
+## Evidence
+### claude
+- Status: `fail`
+- Accepted mode: `neither`
+```text
+[A --no-paste]
+ Quick safety check: Is this a project you created or one you trust? (Like your
+  own code, a well-known open source project, or work from your team). If not,
+ take a moment to review what's in this folder first.
+
+ Claude Code'll be able to read, edit, and execute files here.
+
+ Security guide
+
+ ❯ 1. Yes, I trust this folder
+   2. No, exit
+
+ Enter to confirm · Esc to cancel
+
+[B default paste]
+ Quick safety check: Is this a project you created or one you trust? (Like your
+  own code, a well-known open source project, or work from your team). If not,
+ take a moment to review what's in this folder first.
+
+ Claude Code'll be able to read, edit, and execute files here.
+
+ Security guide
+
+ ❯ 1. Yes, I trust this folder
+   2. No, exit
+
+ Enter to confirm · Esc to cancel
+```
+### codex
+- Status: `fail`
+- Accepted mode: `neither`
+```text
+[A --no-paste]
+⚠️ Process "codex" in domain "local" didn't exit cleanly
+Exited with code 1.
+This message is shown because exit_behavior="CloseOnCleanExit"
+
+[B default paste]
+⚠️ Process "codex" in domain "local" didn't exit cleanly
+Exited with code 1.
+This message is shown because exit_behavior="CloseOnCleanExit"
+```
+### gemini
+- Status: `blocked`
+- Accepted mode: `blocked`
+```text
+command not installed or not on PATH
+```
+
+## Environment Notes
+- `gemini` could not be tested because the executable is unavailable in this environment.
+
+## Gemini (re-tested after install)
+- Re-test date: `2026-04-24`
+- Status: `blocked`
+- Accepted mode: `blocked`
+```text
+[PATH checks]
+PowerShell: gemini --version
+  The term 'gemini' is not recognized as a name of a cmdlet, function, script file, or executable program.
+
+PowerShell: where.exe gemini
+  INFO: Could not find files for the given pattern(s).
+
+bash: command -v gemini
+  <no output>
+```
+- Result: the binary is still unavailable on this machine, so spike 2 remains blocked for Gemini.
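
Reviewer note on Mode A and Mode B: both are written with `'/help\n'` in single quotes. If the probes were run as written from bash or PowerShell, that argument is the literal characters `\` and `n`, not a newline, which may be worth ruling out before concluding that neither mode submits. A sketch of building the two argv forms from Python, where `"\n"` is a real newline and the terminator is explicit. `send_text_argv` and its `terminator` parameter are hypothetical, and whether a carriage return submits input in any of these TUIs is untested here.

```python
def send_text_argv(
    pane_id: str,
    text: str,
    *,
    no_paste: bool,
    terminator: str = "\n",
    wezterm: str = "wezterm",
) -> list[str]:
    """Build a `wezterm cli send-text` argv for Mode A (no_paste=True) or Mode B."""
    argv = [wezterm, "cli", "send-text", "--pane-id", pane_id]
    if no_paste:
        # Mode A: send the text as typed input rather than as a paste.
        argv.append("--no-paste")
    # The terminator here is a real control character, not a backslash escape.
    argv += ["--", text + terminator]
    return argv


mode_a = send_text_argv("17", "/help", no_paste=True)
mode_b = send_text_argv("17", "/help", no_paste=False)
# Variant to try if "\n" only populates the composer: send a carriage return.
mode_a_cr = send_text_argv("17", "/help", no_paste=True, terminator="\r")
```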
diff --git a/spikes/02b-codex-launch.md b/spikes/02b-codex-launch.md
@@ -0,0 +1,85 @@
+# Spike 2b Result
+
+- Verdict: **NEEDS-WORKAROUND**
+- Goal status: `launch fixed, raw send-text submission still unresolved`
+- Working launch command:
+  ```powershell
+  & 'C:\Users\marc\Downloads\WezTerm-windows-20260331-040028-577474d8\wezterm.exe' cli spawn --new-window --cwd C:\dev\aws-cao -- C:\Users\marc\scoop\apps\nodejs-lts\current\bin\codex.cmd -c hooks=[] --yolo --no-alt-screen --disable shell_snapshot
+  ```
+- TUI-ready latency: `2319 ms avg` across `1755 / 2469 / 2734 ms`
+- Send-text verdict: `neither`
+
+## What Worked
+- The pane stayed alive and rendered the Codex TUI when launched via the Windows shim `codex.cmd`.
+- CAO's tmux flags were still necessary: `--yolo --no-alt-screen --disable shell_snapshot`.
+- A one-shot config override `-c hooks=[]` was also necessary on this machine because interactive Codex rejected the local `hooks` config schema during startup.
+
+## Why Naive Spawn Exited
+- `wezterm cli spawn --new-window -- codex` launched inside Marc's default WezTerm shell domain, which is `bash` in a Linux-style environment for this window.
+- In that shell, `codex` resolved to `/mnt/c/.../codex`, then aborted with:
+  `Error: Missing optional dependency @openai/codex-linux-arm64`
+- When forced onto the Windows Codex shim, startup progressed but interactive Codex still aborted unless `-c hooks=[]` was added, due to:
+  `invalid type: map, expected a sequence in hooks`
+
+## Send-Text Probe
+- Mode A: `wezterm cli send-text --pane-id <ID> --no-paste -- '/help\n'`
+- Mode B: `wezterm cli send-text --pane-id <ID> -- '/help\n'`
+- Result: both modes inserted text into Codex's composer, but neither mode visibly submitted the message or produced command output.
+- Fallback text prompts behaved the same way: the prompt text appeared after `›`, but Codex did not execute it within the observation window.
+
+```text
+[A --no-paste]
+› /help
+  gpt-5.4 default · C:\dev\aws-cao
+
+[B default paste]
+› /help
+  gpt-5.4 default · C:\dev\aws-cao
+```
+
+## Evidence
+### Failing naive launch in WezTerm shell domain
+```text
+file:///mnt/c/Users/marc/scoop/persist/nodejs-lts/bin/node_modules/@openai/codex/bin/codex.js:100
+Error: Missing optional dependency @openai/codex-linux-arm64. Reinstall Codex: npm install -g @openai/codex@latest
+```
+
+### Successful TUI launch with explicit Windows Codex
+```text
+╭─────────────────────────────────────────╮
+│ >_ OpenAI Codex (v0.124.0)              │
+│ model:       gpt-5.4   /model to change │
+│ directory:   C:\dev\aws-cao             │
+│ permissions: YOLO mode                  │
+╰─────────────────────────────────────────╯
+
+⚠ failed to parse hooks config C:\Users\marc\.codex\hooks.json: expected value
+⚠ failed to parse TOML hooks in C:\Users\marc\.codex\config.toml: invalid type: map, expected a sequence
+
+› Summarize recent commits
+  gpt-5.4 default · C:\dev\aws-cao
+```
+
+## WezTerm Backend Construction Diff
+```diff
+--- a/src/cli_agent_orchestrator/providers/codex.py
++++ b/src/cli_agent_orchestrator/multiplexers/wezterm.py
+@@
+- command = shlex.join(["codex", "--yolo", "--no-alt-screen", "--disable", "shell_snapshot"])
++ spawn_argv = [
++   resolve_windows_codex(),  # prefer codex.cmd on Windows; avoid bash/WSL shim resolution
++   "-c",
++   "hooks=[]",               # local interactive Codex rejected ~/.codex hooks schema on marcwin
++   "--yolo",
++   "--no-alt-screen",
++   "--disable",
++   "shell_snapshot",
++ ]
++ # spawn_argv is then passed to: wezterm cli spawn --new-window --cwd <workdir> -- <spawn_argv...>
+```
+
+## Recommendation
+- For WezTerm on Windows, do not rely on shell-resolved `codex`.
+- Resolve the executable explicitly to the Windows shim (`codex.cmd`) before calling `wezterm cli spawn`.
+- Carry forward CAO's existing flags unchanged.
+- Keep a provider/backend-specific workaround slot for local Codex config overrides, because interactive startup can fail before the TUI becomes reachable.
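
Reviewer sketch of the recommendation above: resolve the Windows shim explicitly, keep CAO's flags unchanged, and leave a slot for per-machine config overrides. The construction diff names `resolve_windows_codex()` but does not show its body, so the implementation here is an assumption (a plain PATH lookup for `codex.cmd`), and `codex_spawn_argv` is a hypothetical helper.

```python
import shutil
import sys
from typing import Callable, Optional, Sequence

# CAO's existing Codex flags, carried forward unchanged.
CODEX_FLAGS = ["--yolo", "--no-alt-screen", "--disable", "shell_snapshot"]


def resolve_windows_codex(
    which: Callable[[str], Optional[str]] = shutil.which,
    platform: str = sys.platform,
) -> str:
    """Return an explicit path to Codex, preferring codex.cmd on Windows."""
    name = "codex.cmd" if platform == "win32" else "codex"
    path = which(name)
    if path is None:
        raise FileNotFoundError(f"{name} not found on PATH")
    return path


def codex_spawn_argv(
    wezterm: str,
    workdir: str,
    codex_path: str,
    config_overrides: Sequence[str] = (),
) -> list[str]:
    """Build the `wezterm cli spawn` argv; overrides become `-c key=value` pairs."""
    argv = [wezterm, "cli", "spawn", "--new-window", "--cwd", workdir, "--", codex_path]
    for override in config_overrides:
        argv += ["-c", override]
    return argv + CODEX_FLAGS
```

Passing `config_overrides=["hooks=[]"]` reproduces the working launch command from this spike; an empty sequence gives the launch without the local workaround.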
diff --git a/spikes/03-result.md b/spikes/03-result.md
@@ -0,0 +1,39 @@
+# Spike 3 Result
+
+- Verdict: **GO**
+- Recommended interval: `500 ms`
+
+## Measurements
+
+| Interval | First detection (ms) | CPU % | Poll count | Miss count |
+|---|---:|---:|---:|---:|
+| 100 ms | 152.7 | 2.04 | 23 | 0 |
+| 200 ms | 207.3 | 3.64 | 13 | 0 |
+| 500 ms | 144.2 | 0.83 | 16 | 0 |
+
+## Raw JSON
+```json
+[
+  {
+    "interval_ms": 100,
+    "first_detection_ms": 152.7,
+    "cpu_percent": 2.04,
+    "polls": 23,
+    "miss_count": 0
+  },
+  {
+    "interval_ms": 200,
+    "first_detection_ms": 207.3,
+    "cpu_percent": 3.64,
+    "polls": 13,
+    "miss_count": 0
+  },
+  {
+    "interval_ms": 500,
+    "first_detection_ms": 144.2,
+    "cpu_percent": 0.83,
+    "polls": 16,
+    "miss_count": 0
+  }
+]
+```
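
Reviewer sketch of the polling loop these measurements imply, defaulting to the recommended 500 ms interval. `poll_for_marker` is a hypothetical name; `get_text` stands in for a call to `wezterm cli get-text`, and the sleep and clock functions are injectable so the loop can be tested without real delays.

```python
import time
from typing import Callable, Optional


def poll_for_marker(
    get_text: Callable[[], str],
    marker: str,
    interval_s: float = 0.5,
    timeout_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Return seconds elapsed at first detection of marker, or None on timeout."""
    start = clock()
    while True:
        # Check first, so a marker that is already present is detected immediately.
        if marker in get_text():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

Checking before sleeping means the first poll happens at time zero, so detection latency depends on where the marker lands relative to the poll schedule and not only on the interval length.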
diff --git a/spikes/04-result.md b/spikes/04-result.md
@@ -0,0 +1,103 @@
+# Spike 4 Result
+- Verdict: **NEEDS-WORKAROUND**
+- Summary: `claude: missing BYPASS_PROMPT_PATTERN; codex: missing IDLE_PROMPT_PATTERN, TRUST_PROMPT_PATTERN, WAITING_PROMPT_PATTERN, CODEX_WELCOME_PATTERN; gemini: blocked`
+
+
+## claude
+- Source: `src\cli_agent_orchestrator\providers\claude_code.py`
+- `IDLE_PROMPT_PATTERN` = `[>❯][\s\xa0]`
+- `TRUST_PROMPT_PATTERN` = `Yes, I trust this folder`
+- `BYPASS_PROMPT_PATTERN` = `Yes, I accept`
+- Plain capture length: `504`
+- Escaped capture length: `1037`
+
+| Pattern | Plain | `--escapes` |
+|---|---|---|
+| `IDLE_PROMPT_PATTERN` | `True` | `True` |
+| `TRUST_PROMPT_PATTERN` | `True` | `False` |
+| `BYPASS_PROMPT_PATTERN` | `False` | `False` |
+
+```text
+────────────────────────────────────────────────────────────────────────────────
+ Accessing workspace:
+
+ C:\dev\aws-cao
+
+ Quick safety check: Is this a project you created or one you trust? (Like your
+  own code, a well-known open source project, or work from your team). If not,
+ take a moment to review what's in this folder first.
+
+ Claude Code'll be able to read, edit, and execute files here.
+
+ Security guide
+
+ ❯ 1. Yes, I trust this folder
+   2. No, exit
+
+ Enter to confirm · Esc to cancel
+```
+
+## codex
+- Source: `src\cli_agent_orchestrator\providers\codex.py`
+- `IDLE_PROMPT_PATTERN` = `(?:❯|›|codex>)`
+- `TRUST_PROMPT_PATTERN` = `allow Codex to work in this folder`
+- `WAITING_PROMPT_PATTERN` = `^(?:Approve|Allow)\b.*\b(?:y/n|yes/no|yes|no)\b`
+- `CODEX_WELCOME_PATTERN` = `OpenAI Codex`
+- Plain capture length: `161`
+- Escaped capture length: `232`
+
+| Pattern | Plain | `--escapes` |
+|---|---|---|
+| `IDLE_PROMPT_PATTERN` | `False` | `False` |
+| `TRUST_PROMPT_PATTERN` | `False` | `False` |
+| `WAITING_PROMPT_PATTERN` | `False` | `False` |
+| `CODEX_WELCOME_PATTERN` | `False` | `False` |
+
+```text
+⚠️ Process "codex" in domain "local" didn't exit cleanly
+Exited with code 1.
+This message is shown because exit_behavior="CloseOnCleanExit"
+```
+
+## gemini
+- Source: `src\cli_agent_orchestrator\providers\gemini_cli.py`
+- `IDLE_PROMPT_PATTERN` = `\*\s+Type your message`
+- `WELCOME_BANNER_PATTERN` = `█████████.*██████████`
+- `RESPONDING_WITH_PATTERN` = `Responding with\s+\S+`
+- Runtime probe: blocked; `gemini` executable unavailable.
+## Candidate Regex Patch Notes
+```diff
+--- a/src/cli_agent_orchestrator/providers/claude_code.py
++++ b/src/cli_agent_orchestrator/providers/claude_code.py
+@@
+-# Existing WezTerm probe did not match: BYPASS_PROMPT_PATTERN
++# Phase 2: either normalize WezTerm startup text or broaden these regexes: BYPASS_PROMPT_PATTERN
+```
+
+```diff
+--- a/src/cli_agent_orchestrator/providers/codex.py
++++ b/src/cli_agent_orchestrator/providers/codex.py
+@@
+-# Existing WezTerm probe did not match: IDLE_PROMPT_PATTERN, TRUST_PROMPT_PATTERN, WAITING_PROMPT_PATTERN, CODEX_WELCOME_PATTERN
++# Phase 2: either normalize WezTerm startup text or broaden these regexes: IDLE_PROMPT_PATTERN, TRUST_PROMPT_PATTERN, WAITING_PROMPT_PATTERN, CODEX_WELCOME_PATTERN
+```
+
+## Gemini (re-tested after install)
+- Re-test date: `2026-04-24`
+- Source: `src\cli_agent_orchestrator\providers\gemini_cli.py`
+- `IDLE_PROMPT_PATTERN` = `\*\s+Type your message`
+- `WELCOME_BANNER_PATTERN` = `█████████.*██████████`
+- `RESPONDING_WITH_PATTERN` = `Responding with\s+\S+`
+- Runtime probe: still blocked; `gemini` is not available from PowerShell, `where.exe`, or `bash`.
+
+```text
+PowerShell: gemini --version
+  The term 'gemini' is not recognized as a name of a cmdlet, function, script file, or executable program.
+
+PowerShell: where.exe gemini
+  INFO: Could not find files for the given pattern(s).
+
+bash: command -v gemini
+  <no output>
+```
+
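
Reviewer sketch for the plain versus `--escapes` mismatch on `TRUST_PROMPT_PATTERN`: one option is to strip escape sequences before matching, so the same regex works on both captures. The `--escapes` capture itself is not shown in the spike, so the sample string below is synthetic and the placement of its escape codes is an assumption. `ANSI_RE`, `strip_ansi`, and `matches` are hypothetical names.

```python
import re

ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"             # CSI sequences such as SGR colour codes
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"  # OSC sequences terminated by BEL or ST
)

# Pattern value as reported for claude_code.py in this spike.
TRUST_PROMPT_PATTERN = r"Yes, I trust this folder"


def strip_ansi(text: str) -> str:
    """Remove CSI and OSC escape sequences, leaving only printable text."""
    return ANSI_RE.sub("", text)


def matches(pattern: str, text: str, normalize: bool = True) -> bool:
    """Search for pattern, optionally after stripping escape sequences."""
    return re.search(pattern, strip_ansi(text) if normalize else text) is not None


# Synthetic example: styling codes split the phrase the regex expects.
escaped = "\x1b[1mYes, I\x1b[0m trust \x1b[32mthis folder\x1b[0m"
```

If plain `get-text` output is used everywhere, as the summary suggests, this normalization may only be needed as a safety net for captures taken with `--escapes`.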
diff --git a/spikes/SUMMARY.md b/spikes/SUMMARY.md
@@ -0,0 +1,9 @@
+# WezTerm Phase 1 Spike Summary
+
+| # | Spike | Verdict | Key finding | Phase 2 implication |
+|---|---|---|---|---|
+| 1 | `send-text` + `get-text` round-trip | GO | `wezterm cli spawn/send-text/get-text/kill-pane` works on marcwin when using the local WezTerm binary and waiting for a shell-ready marker. | The substrate is viable; backend work can proceed. |
+| 2 | Paste-mode behavior in AI CLIs | NEEDS-WORKAROUND | Claude stayed on its trust prompt under both paste modes; Codex now launches with an explicit Windows shim, but raw `send-text` still only populates the composer; Gemini is still unavailable on this machine. | Phase 2 still needs per-provider startup and submission handling; raw `send-text` alone is not sufficient for Codex, and no default paste mode can be chosen globally yet. |
+| 2b | Codex launch args | NEEDS-WORKAROUND | Codex stayed alive only when WezTerm spawned `codex.cmd -c hooks=[] --yolo --no-alt-screen --disable shell_snapshot`; naive `codex` hit the bash/WSL shim and exited, but post-launch `send-text` still did not submit. | The WezTerm backend should resolve the Windows Codex shim explicitly and preserve CAO's flags, but another mechanism is still needed to submit input after text insertion. |
+| 3 | Polling latency for `pipe_pane` replacement | GO | `wezterm cli get-text` polling saw all 10/10 burst markers at 100/200/500 ms with first-detection latencies of 152.7 ms, 207.3 ms, and 144.2 ms respectively. | Replacing `pipe-pane` with polling is feasible; start with 500 ms for lower CPU and tune if inbox responsiveness needs more aggression. |
+| 4 | `get-text` regex compatibility | NEEDS-WORKAROUND | Claude trust text matches in plain `get-text` output, but not consistently in `--escapes`; Codex startup is now understood as a shell/config issue rather than a pure regex issue; Gemini is still unavailable on this machine. | Phase 2 should normalize plain `get-text` output first, fix Codex launch path separately from regex handling, and defer Gemini-specific regex validation until the binary is actually reachable. |