pruiz · pruiz · May 18, 2026
diff --git a/.project/chat-bridge-plan.md b/.project/chat-bridge-plan.md
@@ -0,0 +1,192 @@
+# Chat Bridge Plan
+
+## Problem
+
+CodeCome currently launches OpenCode through `opencode run --format json` and renders its event stream in `tools/run-agent.py`.
+
+The current Textual chat prototype has two blockers:
+
+1. `opencode run --port` does not expose a usable HTTP server for the non-attach `run` path, so direct HTTP `POST /session/{id}/message` fails with `Connection refused`.
+2. Falling back to launching a fresh `opencode run` for every chat message would make the chat path too slow.
+
+There are also two UI issues:
+
+1. The initial `Starting interactive chat harness` message appears too late because chat startup currently blocks on model-resolution and probe work before printing it.
+2. `Ctrl+C` currently fails silently, so quitting requires the command palette; it should open a confirmation modal instead.
+
+## Findings
+
+Upstream `opencode` source confirms that plain `opencode run` does not start a network listener in the non-attach path.
+
+In `packages/opencode/src/cli/cmd/run.ts`, the non-attach execution path builds an SDK client with:
+
+- `baseUrl: "http://opencode.internal"`
+- a custom in-process `fetch` that calls `Server.Default().app.fetch(request)`
+
+This means:
+
+- the normal `run` path talks to OpenCode in-process,
+- `args.port` is not consumed there,
+- the HTTP routes like `/session/{sessionID}/message` and `/tui/append-prompt` exist on the server HTTP API, but are not exposed by the plain `run` path.
+
+The upstream plugin API exposes:
+
+- `client`
+- `serverUrl`
+- hooks such as `chat.message`
+
+The SDK client supports low-latency session prompting through `client.session.prompt(...)`.
+
+## Solution
+
+Implement a local plugin-backed chat bridge that keeps the existing `opencode run --format json` launch model, but gives the Textual UI a low-latency way to inject new user prompts into the active session.
+
+### Bridge Architecture
+
+1. Add a new local OpenCode plugin under `.opencode/plugins/`.
+2. When loaded, the plugin starts a tiny localhost bridge server bound to `127.0.0.1` on a random port.
+3. The plugin generates a random auth token.
+4. The plugin emits a JSON line to stdout announcing readiness, for example:
+   - `type: "chat.bridge.ready"`
+   - `properties.port`
+   - `properties.token`
+5. `tools/run-agent.py` captures that event and stores the bridge connection info.
+6. The Textual chat input sends messages to that bridge over localhost HTTP.
+7. The plugin receives the request and calls `client.session.prompt(...)` against the active session.
+8. OpenCode continues emitting its normal JSON event stream to stdout, so the existing renderer path remains the source of truth for the upper panel.
+
+This avoids:
+
+- switching the main launcher to `opencode serve`,
+- spawning a full extra `opencode run` per chat message,
+- adding polling hacks or a second event protocol.
+
+### Session Handling
+
+Support only one active session at a time.
+
+The bridge should maintain a single active `sessionID`, learned from `run-agent.py` as soon as the main JSON stream exposes it.
+
+Recommended behavior:
+
+1. `run-agent.py` learns the current `sessionID` from streamed events.
+2. `run-agent.py` sends that `sessionID` to the plugin bridge once it is known, or includes it in the first `/message` request.
+3. The plugin stores it as the only accepted active session.
+4. Any attempt to prompt a different session should fail fast.
+
+This keeps the bridge state simple and matches the current Textual UI model.
+
+### Transport
+
+Use localhost HTTP on `127.0.0.1` with a random token.
+
+Reasoning:
+
+- simpler to implement than Unix sockets,
+- easy for Python `urllib` or `http.client`,
+- acceptable for now when bound to loopback and protected by a random token.
+
+Suggested request:
+
+- `POST /message`
+- header: `Authorization: Bearer <token>`
+- body:
+  - `text`
+
+Suggested optional request:
+
+- `POST /session`
+- header: `Authorization: Bearer <token>`
+- body:
+  - `sessionID`
+
+Suggested response:
+
+- `{"ok": true}` or structured error JSON
+
+### Plugin Responsibilities
+
+The plugin should:
+
+1. Start the bridge server at initialization.
+2. Emit `chat.bridge.ready` once listening.
+3. Accept authenticated POST requests.
+4. Track exactly one active session.
+5. Call `client.session.prompt({ path: { id: sessionID }, body: { parts: [{ type: "text", text }] } })`.
+6. Return success or failure quickly.
+7. Close the bridge server during shutdown if possible.
+
+If bridge submission fails, the plugin should emit a stdout event such as `chat.bridge.error` with a human-readable message so `run-agent.py` can surface it in the upper panel.
+
+## Textual UI Changes
+
+### Startup Feedback
+
+Move the `Starting interactive chat harness` message to immediately after console creation and before model-resolution and runtime-probe work.
+
+This ensures the user sees feedback instantly on `make chat`.
+
+### Ctrl+C Confirm Modal
+
+Override `ctrl+c` in the Textual app.
+
+Add a `ModalScreen` with:
+
+- message: `Are you sure you want to quit?`
+- buttons:
+  - `Quit`
+  - `Cancel`
+
+If confirmed:
+
+- terminate the main `opencode` process group,
+- exit the TUI cleanly.
+
+### Layout
+
+Keep the current fix that removes bottom docking from the chat input so the footer does not overlap it.
+
+## `run-agent.py` Integration Plan
+
+1. Extend chat-mode startup to wait for `chat.bridge.ready`.
+2. Store:
+   - `bridge_port`
+   - `bridge_token`
+3. Track one active `sessionID` from the main JSON stream.
+4. On chat submit:
+   - reject submission if bridge is not ready,
+   - reject submission if active `sessionID` is not known yet,
+   - POST to the local bridge with the message text,
+   - do not spawn a separate `opencode run`.
+5. Keep all upper-panel rendering driven exclusively by the original JSON stdout stream.
+6. Render bridge failures in the upper panel until a better UX exists.
+7. Add quit-confirm modal and process cleanup.
+
+## Suggested New Files
+
+- `.opencode/plugins/chat-bridge.ts`
+- `.project/chat-bridge-plan.md`
+
+## Validation Plan
+
+1. Run `make chat`.
+2. Confirm the startup message appears immediately.
+3. Confirm the TUI opens with no footer/input overlap.
+4. Confirm the plugin emits `chat.bridge.ready`.
+5. Confirm the bridge learns exactly one active session.
+6. Type a prompt in the lower panel.
+7. Confirm:
+   - no `Connection refused`,
+   - no extra `opencode run` spawn,
+   - low-latency model response,
+   - upper panel receives standard JSON-rendered output.
+8. Trigger a bridge failure and confirm it appears in the upper panel.
+9. Press `Ctrl+C`.
+10. Confirm the quit modal appears.
+11. Confirm quitting tears down the process cleanly.
+
+## Decisions
+
+1. Support only one active session at a time.
+2. Show bridge failures on the upper panel until a better UX exists.
+3. Use localhost transport for now.
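Review note: the `run-agent.py` side of the bridge handshake described in the plan above could look roughly like the following. This is a minimal sketch, not the implementation in this PR. It assumes the ready event is a single JSON line shaped as the plan suggests (`type`, `properties.port`, `properties.token`); `BridgeInfo`, `parse_bridge_ready`, and `build_message_request` are hypothetical names introduced here for illustration.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class BridgeInfo:
    """Connection details announced by the plugin (hypothetical container)."""
    port: int
    token: str


def parse_bridge_ready(line: str) -> Optional[BridgeInfo]:
    """Return bridge info if `line` is a chat.bridge.ready event, else None."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # not JSON, so not a bridge event
    if not isinstance(event, dict) or event.get("type") != "chat.bridge.ready":
        return None
    props = event.get("properties") or {}
    try:
        return BridgeInfo(port=int(props["port"]), token=str(props["token"]))
    except (KeyError, TypeError, ValueError):
        return None  # malformed ready event: treat the bridge as not ready


def build_message_request(bridge: BridgeInfo, text: str) -> urllib.request.Request:
    """Build the POST /message request from the plan's Transport section."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"http://127.0.0.1:{bridge.port}/message",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bridge.token}",
            "Content-Type": "application/json",
        },
    )
```

On chat submit, the request would be sent with `urllib.request.urlopen(request, timeout=...)`, and any `urllib.error.URLError` rendered in the upper panel, in line with decision 2. Building the request separately from sending it keeps the token and URL handling testable without a running bridge.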
diff --git a/.project/e2e-testing-plan.md b/.project/e2e-testing-plan.md
@@ -0,0 +1,36 @@
+# E2E Testing Plan with Docker & aimock
+
+## 1. Provider Configuration (`opencode.json`)
+Add the local Docker mock server to `opencode.json`.
+```json
+  "provider": {
+    "aimock": {
+      "type": "openai",
+      "baseURL": "http://127.0.0.1:4010/v1",
+      "apiKey": "mocked-key"
+    }
+  }
+```
+
+## 2. Makefile Fix (`CODECOME_USE_WRAPPER=0` bug)
+Fix the Makefile so that `OPENCODE_ARGS` is passed through to `opencode run` when the wrapper is bypassed.
+```makefile
+# Before
+opencode run --agent recon "$$(cat prompts/phase-1-recon.md)";
+# After
+opencode run $$OPENCODE_ARGS --agent recon "$$(cat prompts/phase-1-recon.md)";
+```
+
+## 3. Makefile E2E Targets
+Add targets to orchestrate the mock server and test executions:
+
+*   `e2e-server-start`: Runs `aimock` in standard replay mode using the CopilotKit Docker image.
+*   `e2e-server-stop`: Stops and removes the `aimock` container.
+*   `e2e-record`: Starts `aimock` in record mode, pointing to a configurable upstream (default OpenRouter), runs the target phases forcing JSON output, and saves the baseline.
+*   `test-e2e`: Resets the workspace, starts `aimock` in replay mode, and executes the Python verification script.
+
+## 4. Verification Script (`tools/test-e2e.py`)
+Create a script that:
+*   Invokes the test run via `CODECOME_USE_WRAPPER=0 OPENCODE_ARGS="--format json" CODECOME_MODEL=aimock/$(MODEL) make phase-X`.
+*   Captures live stdout (JSON sequence) and compares the agent events (`agent_message`, `tool_call`, `tool_response`) with the recorded baseline.
+*   Asserts file artifacts (`itemdb/notes/*.md`, `itemdb/findings/**/*.md`) match the deterministic outputs exactly.
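Review note: the baseline comparison described in section 4 of the plan could be sketched as below. This is an illustration under stated assumptions, not the contents of `tools/test-e2e.py`. It assumes each stdout line is one JSON object with a top-level `type` field, and it compares only the ordered sequence of event types, since full payloads may carry run-specific values such as IDs or timestamps. `event_types` and `first_mismatch` are hypothetical helper names.

```python
import json
from typing import Iterable, List, Optional

# Event types named in the plan; everything else is ignored.
COMPARED_TYPES = {"agent_message", "tool_call", "tool_response"}


def event_types(lines: Iterable[str]) -> List[str]:
    """Extract the ordered list of compared event types from JSON lines."""
    types: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise such as make or gate-check output
        if isinstance(event, dict) and event.get("type") in COMPARED_TYPES:
            types.append(event["type"])
    return types


def first_mismatch(live: List[str], baseline: List[str]) -> Optional[int]:
    """Return the index of the first difference, or None if the lists match."""
    for index, (got, expected) in enumerate(zip(live, baseline)):
        if got != expected:
            return index
    if len(live) != len(baseline):
        return min(len(live), len(baseline))  # one sequence ended early
    return None
```

Reporting the index of the first divergence, rather than a bare pass/fail, makes it easier to see where a replayed run drifted from the recording.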
diff --git a/Makefile b/Makefile
@@ -132,7 +132,7 @@ venv-check:
 phase-1: venv-check
 	@$(PYTHON) tools/gate-check.py 1
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent recon "$$(cat prompts/phase-1-recon.md)"; \
+		opencode run $$OPENCODE_ARGS --agent recon "$$(cat prompts/phase-1-recon.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 1 --label "Target Reconnaissance + Sandbox Bootstrap" --agent recon --prompt-file prompts/phase-1-recon.md; \
 	fi
@@ -145,15 +145,15 @@ phase-2: venv-check
 		printf "Or override (not recommended): CODECOME_ALLOW_NO_SANDBOX=1 make phase-2\n\n" ; \
 		exit 1 )
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent auditor "$$(cat prompts/phase-2-audit.md)"; \
+		opencode run $$OPENCODE_ARGS --agent auditor "$$(cat prompts/phase-2-audit.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 2 --label "Hypothesis Generation" --agent auditor --prompt-file prompts/phase-2-audit.md; \
 	fi
 
 phase-3: venv-check
 	@$(PYTHON) tools/gate-check.py 3
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent reviewer "$$(cat prompts/phase-3-review.md)"; \
+		opencode run $$OPENCODE_ARGS --agent reviewer "$$(cat prompts/phase-3-review.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 3 --label "Counter-analysis" --agent reviewer --prompt-file prompts/phase-3-review.md; \
 	fi
@@ -162,7 +162,7 @@ phase-4: venv-check
 	@test -n "$(FINDING)" || (printf "\n$(BOLD)$(RED)[FAIL]$(RESET) Missing required FINDING argument for Phase 4 (Validation).\n\nSpecify which finding you want to validate:\n\n    $(BOLD)make phase-4 FINDING=CC-0001$(RESET)\n\nTo list available pending findings: $(BOLD)make findings STATUS=PENDING$(RESET)\n\n" && exit 1)
 	@$(PYTHON) tools/gate-check.py 4 $(FINDING)
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent validator "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-4-validate.md)"; \
+		opencode run $$OPENCODE_ARGS --agent validator "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-4-validate.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 4 --label "Validation" --agent validator --prompt-file prompts/phase-4-validate.md --finding "$(FINDING)"; \
 	fi
@@ -171,15 +171,15 @@ phase-5: venv-check
 	@test -n "$(FINDING)" || (printf "\n$(BOLD)$(RED)[FAIL]$(RESET) Missing required FINDING argument for Phase 5 (Exploitation).\n\nSpecify which finding you want to exploit:\n\n    $(BOLD)make phase-5 FINDING=CC-0001$(RESET)\n\nTo list available confirmed findings: $(BOLD)make findings STATUS=CONFIRMED$(RESET)\n\n" && exit 1)
 	@$(PYTHON) tools/gate-check.py 5 $(FINDING)
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent exploiter "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-5-exploit.md)"; \
+		opencode run $$OPENCODE_ARGS --agent exploiter "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-5-exploit.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 5 --label "Exploit Development" --agent exploiter --prompt-file prompts/phase-5-exploit.md --finding "$(FINDING)"; \
 	fi
 
 phase-6: venv-check
 	@$(PYTHON) tools/gate-check.py 6
 	@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-		opencode run --agent reporter "$$(cat prompts/phase-6-report.md)"; \
+		opencode run $$OPENCODE_ARGS --agent reporter "$$(cat prompts/phase-6-report.md)"; \
 	else \
 		$(PYTHON) tools/run-agent.py --phase 6 --label "Reporting" --agent reporter --prompt-file prompts/phase-6-report.md; \
 	fi
@@ -375,3 +375,46 @@ sandbox-status: venv-check
 #   make show-model AGENT=auditor
 show-model: venv-check
 	@$(PYTHON) tools/run-agent.py --show-model --agent $(or $(AGENT),recon)
+
+# ---------------------------------------------------------------------------
+# E2E Mocking & Testing
+# ---------------------------------------------------------------------------
+
+.PHONY: e2e-server-start e2e-server-stop e2e-record test-e2e
+
+AIMOCK_PORT ?= 4010
+AIMOCK_API_KEY ?=
+AIMOCK_CONTAINER ?= codecome-aimock-server
+AIMOCK_FIXTURES := $(CURDIR)/tests/fixtures/llm-mocks
+AIMOCK_MODEL ?= minimax/minimax-m2.5:free
+AIMOCK_UPSTREAM_URL ?= https://openrouter.ai/api
+
+e2e-server-start:
+	@echo "Starting aimock container..."
+	@mkdir -p "$(AIMOCK_FIXTURES)" tmp
+	@docker run -d --name "$(AIMOCK_CONTAINER)" -p $(AIMOCK_PORT):4010 -v "$(AIMOCK_FIXTURES):/fixtures" ghcr.io/copilotkit/aimock -f /fixtures -h 0.0.0.0 > /dev/null
+	@sleep 2
+
+e2e-server-stop:
+	@echo "Stopping aimock container..."
+	@docker stop "$(AIMOCK_CONTAINER)" >/dev/null 2>&1 || true
+	@docker rm "$(AIMOCK_CONTAINER)" >/dev/null 2>&1 || true
+
+e2e-record: e2e-server-stop
+	@test -n "$(AIMOCK_API_KEY)" || (echo "Please set AIMOCK_API_KEY (your OpenRouter key) to run recording" && exit 1)
+	@echo "Starting aimock in RECORD mode against $(AIMOCK_UPSTREAM_URL)..."
+	@mkdir -p "$(AIMOCK_FIXTURES)" tests/fixtures/recordings tmp
+	@docker run -d --name "$(AIMOCK_CONTAINER)" \
+		-p $(AIMOCK_PORT):4010 \
+		-v "$(AIMOCK_FIXTURES):/fixtures" \
+		ghcr.io/copilotkit/aimock \
+			--log-level debug \
+			--record --provider-openai $(AIMOCK_UPSTREAM_URL) -f /fixtures -p 4010 -h 0.0.0.0
+	@sleep 2
+	@echo "Running Phase 1 and dumping raw JSON to tests/fixtures/recordings/phase-1.json..."
+	@CODECOME_MODEL="aimock/$(AIMOCK_MODEL)" CODECOME_USE_WRAPPER=0 OPENCODE_ARGS="--format json -m aimock/$(AIMOCK_MODEL)" $(MAKE) phase-1 > tests/fixtures/recordings/phase-1.json
+	@echo "Recording finished."
+	@$(MAKE) e2e-server-stop
+
+test-e2e: venv-check
+	@AIMOCK_MODEL=$(AIMOCK_MODEL) $(PYTHON) tools/test-e2e.py
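Review note: `e2e-server-start` and `e2e-record` both wait for the container with a fixed `sleep 2`, which can be too short on a cold image pull and is wasted time otherwise. A possible alternative is to poll the mapped port from the Python side. The sketch below is hypothetical (`wait_for_port` is not part of this PR) and assumes a successful TCP connect is a good enough readiness signal. With Docker port publishing, a connect can succeed before the process inside the container is ready, so an HTTP-level probe may still be needed.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0,
                  interval: float = 0.2) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would use something like `wait_for_port("127.0.0.1", 4010)` after starting the container and abort the run with a clear message if it returns `False`.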
diff --git a/opencode.json b/opencode.json
@@ -16,5 +16,24 @@
       "sandbox/.env": "allow",
       "*/sandbox/.env": "allow"
     }
+  },
+  "provider": {
+    "aimock": {
+      "type": "openai",
+      "options": {
+        "baseURL": "http://127.0.0.1:4010/v1",
+        "apiKey": "{env:AIMOCK_API_KEY}"
+      },
+      "models": {
+        "minimax/minimax-m2.5": {},
+        "minimax/minimax-m2.5:free": {}
+      }
+    }
+  },
+  "agent": {
+    "test": {
+      "temperature": 0,
+      "top_p": 1
+    }
   }
 }