192 changes: 192 additions & 0 deletions .project/chat-bridge-plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,192 @@
# Chat Bridge Plan

## Problem

CodeCome currently launches OpenCode through `opencode run --format json` and renders its event stream in `tools/run-agent.py`.

The current Textual chat prototype has two blockers:

1. `opencode run --port` does not expose a usable HTTP server for the non-attach `run` path, so direct HTTP `POST /session/{id}/message` fails with `Connection refused`.
2. Falling back to launching a fresh `opencode run` for every chat message would make the chat path too slow.

There are also two UI issues:

1. The initial `Starting interactive chat harness` message appears too late because chat startup currently blocks on model-resolution and probe work before printing it.
2. `Ctrl+C` should open a confirmation modal instead of silently failing or requiring the command palette.

## Findings

Upstream `opencode` source confirms that plain `opencode run` does not start a network listener in the non-attach path.

In `packages/opencode/src/cli/cmd/run.ts`, the non-attach execution path builds an SDK client with:

- `baseUrl: "http://opencode.internal"`
- a custom in-process `fetch` that calls `Server.Default().app.fetch(request)`

This means:

- the normal `run` path talks to OpenCode in-process,
- `args.port` is not consumed there,
- the HTTP routes like `/session/{sessionID}/message` and `/tui/append-prompt` exist on the server HTTP API, but are not exposed by the plain `run` path.

The upstream plugin API exposes:

- `client`
- `serverUrl`
- hooks such as `chat.message`

The SDK client supports low-latency session prompting through `client.session.prompt(...)`.

## Solution

Implement a local plugin-backed chat bridge that keeps the existing `opencode run --format json` launch model, but gives the Textual UI a low-latency way to inject new user prompts into the active session.

### Bridge Architecture

1. Add a new local OpenCode plugin under `.opencode/plugins/`.
2. When loaded, the plugin starts a tiny localhost bridge server bound to `127.0.0.1` on a random port.
3. The plugin generates a random auth token.
4. The plugin emits a JSON line to stdout announcing readiness, for example:
   - `type: "chat.bridge.ready"`
   - `properties.port`
   - `properties.token`
5. `tools/run-agent.py` captures that event and stores the bridge connection info.
6. The Textual chat input sends messages to that bridge over localhost HTTP.
7. The plugin receives the request and calls `client.session.prompt(...)` against the active session.
8. OpenCode continues emitting its normal JSON event stream to stdout, so the existing renderer path remains the source of truth for the upper panel.

This avoids:

- switching the main launcher to `opencode serve`,
- spawning a full extra `opencode run` per chat message,
- adding polling hacks or a second event protocol.
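
The readiness handshake in steps 4 and 5 could look roughly like this on the `run-agent.py` side. This is a minimal sketch under stated assumptions, not the actual implementation: `BridgeInfo` and `parse_bridge_ready` are hypothetical names, and the event is assumed to be a single JSON object per stdout line with the `type`, `properties.port`, and `properties.token` fields from the example above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BridgeInfo:
    """Connection details announced by the plugin (hypothetical container)."""
    port: int
    token: str


def parse_bridge_ready(line: str) -> Optional[BridgeInfo]:
    """Return BridgeInfo if `line` is a chat.bridge.ready event, else None."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # not JSON: leave the line to the normal renderer path
    if not isinstance(event, dict) or event.get("type") != "chat.bridge.ready":
        return None
    props = event.get("properties")
    if not isinstance(props, dict):
        return None
    port, token = props.get("port"), props.get("token")
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(port, int) or isinstance(port, bool):
        return None
    if not isinstance(token, str) or not token:
        return None
    return BridgeInfo(port=port, token=token)
```

Returning `None` for anything that is not a well-formed readiness event lets the caller pass every other line to the existing renderer unchanged.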

### Session Handling

Support only one active session at a time.

The bridge should maintain a single active `sessionID`, learned from `run-agent.py` as soon as the main JSON stream exposes it.

Recommended behavior:

1. `run-agent.py` learns the current `sessionID` from streamed events.
2. `run-agent.py` sends that `sessionID` to the plugin bridge once it is known, or includes it in the first `/message` request.
3. The plugin stores it as the only accepted active session.
4. Any attempt to prompt a different session should fail fast.

This keeps the bridge state simple and matches the current Textual UI model.
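
The single-session rule can be modelled in a few lines. The plugin itself would be written in TypeScript, so the Python below is only an illustration of the intended state machine; `ActiveSession` and `SessionMismatch` are hypothetical names.

```python
from typing import Optional


class SessionMismatch(Exception):
    """Raised when a request targets a session other than the active one."""


class ActiveSession:
    """Holds at most one session ID and fails fast on any other ID."""

    def __init__(self) -> None:
        self._session_id: Optional[str] = None

    @property
    def session_id(self) -> Optional[str]:
        return self._session_id

    def register(self, session_id: str) -> None:
        if not session_id:
            raise ValueError("sessionID must be a non-empty string")
        if self._session_id is None:
            self._session_id = session_id  # first ID seen becomes the active one
        elif self._session_id != session_id:
            raise SessionMismatch(
                f"active session is {self._session_id!r}, got {session_id!r}"
            )
        # Re-registering the same ID is a harmless no-op.
```

Making repeated registration of the same ID a no-op means `run-agent.py` can resend the `sessionID` without tracking whether the bridge already has it.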

### Transport

Use localhost HTTP on `127.0.0.1` with a random token.

Reasoning:

- simpler to implement than Unix sockets,
- easy for Python `urllib` or `http.client`,
- acceptable for now when bound to loopback and protected by a random token.

Suggested request:

- `POST /message`
- header: `Authorization: Bearer <token>`
- body:
  - `text`

Suggested optional request:

- `POST /session`
- header: `Authorization: Bearer <token>`
- body:
  - `sessionID`

Suggested response:

- `{"ok": true}` or structured error JSON
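
A sketch of how the Python side might build the suggested `POST /message` request with the standard library. `build_message_request` is a hypothetical helper, and the `Content-Type: application/json` header and JSON-encoded body are assumptions; the plan above only names the route, the `Authorization` header, and the `text` field.

```python
import json
import urllib.request


def build_message_request(port: int, token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the bridge request for one chat message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://127.0.0.1:{port}/message",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed, not specified by the plan
        },
    )
```

The request can then be sent with `urllib.request.urlopen(request, timeout=...)`. Keeping construction separate from sending makes the headers and body easy to check in a unit test without a running bridge.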

### Plugin Responsibilities

The plugin should:

1. Start the bridge server at initialization.
2. Emit `chat.bridge.ready` once listening.
3. Accept authenticated POST requests.
4. Track exactly one active session.
5. Call `client.session.prompt({ path: { id: sessionID }, body: { parts: [{ type: "text", text }] } })`.
6. Return success or failure quickly.
7. Close the bridge server during shutdown if possible.

If bridge submission fails, the plugin should emit a stdout event such as `chat.bridge.error` with a human-readable message so `run-agent.py` can surface it in the upper panel.
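
On the consuming side, `run-agent.py` could turn such an event into a line for the upper panel. This is a sketch only: the plan does not define the event payload, so the `properties.message` field, the fallback text, and the `[chat bridge]` prefix are all assumptions, and `bridge_error_text` is a hypothetical name.

```python
import json
from typing import Optional


def bridge_error_text(line: str) -> Optional[str]:
    """Return display text for a chat.bridge.error event, else None."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or event.get("type") != "chat.bridge.error":
        return None
    props = event.get("properties")
    # Assumed payload shape: {"properties": {"message": "<human-readable text>"}}
    message = props.get("message") if isinstance(props, dict) else None
    if not isinstance(message, str) or not message.strip():
        message = "unknown bridge error"
    return f"[chat bridge] {message.strip()}"
```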

## Textual UI Changes

### Startup Feedback

Move the `Starting interactive chat harness` message to immediately after console creation and before model-resolution and runtime-probe work.

This ensures the user sees feedback instantly on `make chat`.

### Ctrl+C Confirm Modal

Override `ctrl+c` in the Textual app.

Add a `ModalScreen` with:

- message: `Are you sure you want to quit?`
- buttons:
  - `Quit`
  - `Cancel`

If confirmed:

- terminate the main `opencode` process group,
- exit the TUI cleanly.
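
Terminating the process group could be done along these lines. The sketch is POSIX-only and assumes the `opencode` child was started with `start_new_session=True`, so that it leads its own process group and the signal does not also hit the TUI. `terminate_process_group` and the grace period are hypothetical.

```python
import os
import signal
import subprocess
import sys


def terminate_process_group(proc: subprocess.Popen, grace_seconds: float = 3.0) -> int:
    """SIGTERM the child's process group, escalating to SIGKILL after a grace period."""
    if proc.poll() is None:
        try:
            os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
        except ProcessLookupError:
            pass  # the group exited on its own
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
        except ProcessLookupError:
            pass
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for the real child; the actual launch command lives in run-agent.py.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,
    )
    print("exit status:", terminate_process_group(child))
```

Signalling the group rather than the single PID also reaches any helper processes the child has spawned.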

### Layout

Keep the current fix that removes bottom docking from the chat input so the footer does not overlap it.

## `run-agent.py` Integration Plan

1. Extend chat-mode startup to wait for `chat.bridge.ready`.
2. Store:
   - `bridge_port`
   - `bridge_token`
3. Track one active `sessionID` from the main JSON stream.
4. On chat submit:
   - reject submission if bridge is not ready,
   - reject submission if active `sessionID` is not known yet,
   - POST to the local bridge with the message text,
   - do not spawn a separate `opencode run`.
5. Keep all upper-panel rendering driven exclusively by the original JSON stdout stream.
6. Render bridge failures in the upper panel until a better UX exists.
7. Add quit-confirm modal and process cleanup.
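
The submit-time checks in step 4 could be expressed as a small pre-flight function. This is an illustrative sketch: `ChatState` and `submit_blocker` are hypothetical names, and the empty-message check is an addition that the plan does not mention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatState:
    """Bridge and session details collected from the JSON stdout stream."""
    bridge_port: Optional[int] = None
    bridge_token: Optional[str] = None
    session_id: Optional[str] = None


def submit_blocker(state: ChatState, text: str) -> Optional[str]:
    """Return the reason a message cannot be sent yet, or None if it can."""
    if not text.strip():
        return "message is empty"  # extra guard, not part of the plan
    if state.bridge_port is None or state.bridge_token is None:
        return "chat bridge is not ready yet"
    if state.session_id is None:
        return "active session is not known yet"
    return None
```

Returning a reason string rather than raising lets the UI show the rejection in the upper panel, in line with step 6.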

## Suggested New Files

- `.opencode/plugins/chat-bridge.ts`
- `.project/chat-bridge-plan.md`

## Validation Plan

1. Run `make chat`.
2. Confirm the startup message appears immediately.
3. Confirm the TUI opens with no footer/input overlap.
4. Confirm the plugin emits `chat.bridge.ready`.
5. Confirm the bridge learns exactly one active session.
6. Type a prompt in the lower panel.
7. Confirm:
   - no `Connection refused`,
   - no extra `opencode run` spawn,
   - low-latency model response,
   - upper panel receives standard JSON-rendered output.
8. Trigger a bridge failure and confirm it appears in the upper panel.
9. Press `Ctrl+C`.
10. Confirm the quit modal appears.
11. Confirm quitting tears down the process cleanly.

## Decisions

1. Support only one active session at a time.
2. Show bridge failures on the upper panel until a better UX exists.
3. Use localhost transport for now.
36 changes: 36 additions & 0 deletions .project/e2e-testing-plan.md
@@ -0,0 +1,36 @@
# E2E Testing Plan with Docker & aimock

## 1. Provider Configuration (`opencode.json`)
Add the local Docker mock server to `opencode.json`.
```json
"provider": {
  "aimock": {
    "type": "openai",
    "baseURL": "http://127.0.0.1:4010/v1",
    "apiKey": "mocked-key"
  }
}
```

## 2. Makefile Fix (`CODECOME_USE_WRAPPER=0` bug)
Fix the Makefile so that `OPENCODE_ARGS` are passed down when the wrapper is bypassed.
```makefile
# Before
opencode run --agent recon "$$(cat prompts/phase-1-recon.md)";
# After
opencode run $$OPENCODE_ARGS --agent recon "$$(cat prompts/phase-1-recon.md)";
```

## 3. Makefile E2E Targets
Add targets to orchestrate the mock server and test executions:

* `e2e-server-start`: Runs `aimock` in standard replay mode using the CopilotKit Docker image.
* `e2e-server-stop`: Stops and removes the `aimock` container.
* `e2e-record`: Starts `aimock` in record mode, pointing to a configurable upstream (default OpenRouter), runs the target phases forcing JSON output, and saves the baseline.
* `test-e2e`: Resets the workspace, starts `aimock` in replay mode, and executes the Python verification script.

## 4. Verification Script (`tools/test-e2e.py`)
Create a script that:
* Invokes the test run via `CODECOME_USE_WRAPPER=0 OPENCODE_ARGS="--format json" CODECOME_MODEL=aimock/$(MODEL) make phase-X`.
* Captures live stdout (JSON sequence) and compares the agent events (`agent_message`, `tool_call`, `tool_response`) with the recorded baseline.
* Asserts file artifacts (`itemdb/notes/*.md`, `itemdb/findings/**/*.md`) match the deterministic outputs exactly.
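
The event comparison could be sketched as below. It assumes the live output and the recorded baseline are both streams of one JSON object per line with a top-level `type` field, and that strict equality is appropriate; in practice, volatile fields such as timestamps or IDs may need to be stripped first. `agent_events` and `first_mismatch` are hypothetical helper names.

```python
import json
from typing import Iterable, List, Optional

COMPARED_TYPES = {"agent_message", "tool_call", "tool_response"}


def agent_events(lines: Iterable[str]) -> List[dict]:
    """Keep only the event types that the E2E test compares."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise such as make output
        if isinstance(event, dict) and event.get("type") in COMPARED_TYPES:
            events.append(event)
    return events


def first_mismatch(live: List[dict], baseline: List[dict]) -> Optional[int]:
    """Return the index of the first differing event, or None if they match."""
    for index, (got, expected) in enumerate(zip(live, baseline)):
        if got != expected:
            return index
    if len(live) != len(baseline):
        return min(len(live), len(baseline))  # one stream ended early
    return None
```

Reporting the index of the first difference, rather than a plain pass or fail, should make a replay drift easier to locate in the recording.
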
55 changes: 49 additions & 6 deletions Makefile
@@ -132,7 +132,7 @@ venv-check:
phase-1: venv-check
@$(PYTHON) tools/gate-check.py 1
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent recon "$$(cat prompts/phase-1-recon.md)"; \
+opencode run $$OPENCODE_ARGS --agent recon "$$(cat prompts/phase-1-recon.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 1 --label "Target Reconnaissance + Sandbox Bootstrap" --agent recon --prompt-file prompts/phase-1-recon.md; \
fi
@@ -145,15 +145,15 @@ phase-2: venv-check
printf "Or override (not recommended): CODECOME_ALLOW_NO_SANDBOX=1 make phase-2\n\n" ; \
exit 1 )
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent auditor "$$(cat prompts/phase-2-audit.md)"; \
+opencode run $$OPENCODE_ARGS --agent auditor "$$(cat prompts/phase-2-audit.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 2 --label "Hypothesis Generation" --agent auditor --prompt-file prompts/phase-2-audit.md; \
fi

phase-3: venv-check
@$(PYTHON) tools/gate-check.py 3
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent reviewer "$$(cat prompts/phase-3-review.md)"; \
+opencode run $$OPENCODE_ARGS --agent reviewer "$$(cat prompts/phase-3-review.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 3 --label "Counter-analysis" --agent reviewer --prompt-file prompts/phase-3-review.md; \
fi
@@ -162,7 +162,7 @@ phase-4: venv-check
@test -n "$(FINDING)" || (printf "\n$(BOLD)$(RED)[FAIL]$(RESET) Missing required FINDING argument for Phase 4 (Validation).\n\nSpecify which finding you want to validate:\n\n $(BOLD)make phase-4 FINDING=CC-0001$(RESET)\n\nTo list available pending findings: $(BOLD)make findings STATUS=PENDING$(RESET)\n\n" && exit 1)
@$(PYTHON) tools/gate-check.py 4 $(FINDING)
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent validator "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-4-validate.md)"; \
+opencode run $$OPENCODE_ARGS --agent validator "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-4-validate.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 4 --label "Validation" --agent validator --prompt-file prompts/phase-4-validate.md --finding "$(FINDING)"; \
fi
@@ -171,15 +171,15 @@ phase-5: venv-check
@test -n "$(FINDING)" || (printf "\n$(BOLD)$(RED)[FAIL]$(RESET) Missing required FINDING argument for Phase 5 (Exploitation).\n\nSpecify which finding you want to exploit:\n\n $(BOLD)make phase-5 FINDING=CC-0001$(RESET)\n\nTo list available confirmed findings: $(BOLD)make findings STATUS=CONFIRMED$(RESET)\n\n" && exit 1)
@$(PYTHON) tools/gate-check.py 5 $(FINDING)
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent exploiter "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-5-exploit.md)"; \
+opencode run $$OPENCODE_ARGS --agent exploiter "$$(sed 's#FINDING_PATH_OR_ID#$(FINDING)#g' prompts/phase-5-exploit.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 5 --label "Exploit Development" --agent exploiter --prompt-file prompts/phase-5-exploit.md --finding "$(FINDING)"; \
fi

phase-6: venv-check
@$(PYTHON) tools/gate-check.py 6
@if [ "$$CODECOME_USE_WRAPPER" = "0" ]; then \
-opencode run --agent reporter "$$(cat prompts/phase-6-report.md)"; \
+opencode run $$OPENCODE_ARGS --agent reporter "$$(cat prompts/phase-6-report.md)"; \
else \
$(PYTHON) tools/run-agent.py --phase 6 --label "Reporting" --agent reporter --prompt-file prompts/phase-6-report.md; \
fi
@@ -375,3 +375,46 @@ sandbox-status: venv-check
# make show-model AGENT=auditor
show-model: venv-check
@$(PYTHON) tools/run-agent.py --show-model --agent $(or $(AGENT),recon)

# ---------------------------------------------------------------------------
# E2E Mocking & Testing
# ---------------------------------------------------------------------------

.PHONY: e2e-server-start e2e-server-stop e2e-record test-e2e

AIMOCK_PORT ?= 4010
AIMOCK_API_KEY ?=
AIMOCK_CONTAINER ?= codecome-aimock-server
AIMOCK_FIXTURES := $(CURDIR)/tests/fixtures/llm-mocks
AIMOCK_MODEL ?= minimax/minimax-m2.5:free
AIMOCK_UPSTREAM_URL ?= https://openrouter.ai/api

e2e-server-start:
@echo "Starting aimock container..."
@mkdir -p "$(AIMOCK_FIXTURES)" tmp
@docker run -d --name "$(AIMOCK_CONTAINER)" -p $(AIMOCK_PORT):4010 -v "$(AIMOCK_FIXTURES):/fixtures" ghcr.io/copilotkit/aimock -f /fixtures -h 0.0.0.0 > /dev/null
@sleep 2

e2e-server-stop:
@echo "Stopping aimock container..."
@docker stop "$(AIMOCK_CONTAINER)" >/dev/null 2>&1 || true
@docker rm "$(AIMOCK_CONTAINER)" >/dev/null 2>&1 || true

e2e-record: e2e-server-stop
@test -n "$(AIMOCK_API_KEY)" || (echo "Please set AIMOCK_API_KEY (your OpenRouter key) to run recording" && exit 1)
@echo "Starting aimock in RECORD mode against $(AIMOCK_UPSTREAM_URL)..."
@mkdir -p "$(AIMOCK_FIXTURES)" tests/fixtures/recordings tmp
@docker run -d --name "$(AIMOCK_CONTAINER)" \
-p $(AIMOCK_PORT):4010 \
-v "$(AIMOCK_FIXTURES):/fixtures" \
ghcr.io/copilotkit/aimock \
--log-level debug \
--record --provider-openai $(AIMOCK_UPSTREAM_URL) -f /fixtures -p 4010 -h 0.0.0.0
@sleep 2
@echo "Running Phase 1 and dumping raw JSON to tests/fixtures/recordings/phase-1.json..."
@CODECOME_MODEL="aimock/$(AIMOCK_MODEL)" CODECOME_USE_WRAPPER=0 OPENCODE_ARGS="--format json -m aimock/$(AIMOCK_MODEL)" $(MAKE) phase-1 > tests/fixtures/recordings/phase-1.json
@echo "Recording finished."
@$(MAKE) e2e-server-stop

test-e2e: venv-check
@AIMOCK_MODEL=$(AIMOCK_MODEL) $(PYTHON) tools/test-e2e.py
19 changes: 19 additions & 0 deletions opencode.json
@@ -16,5 +16,24 @@
"sandbox/.env": "allow",
"*/sandbox/.env": "allow"
}
},
"provider": {
"aimock": {
"type": "openai",
"options": {
"baseURL": "http://127.0.0.1:4010/v1",
"apiKey": "{env:AIMOCK_API_KEY}"
},
"models": {
"minimax/minimax-m2.5": {},
"minimax/minimax-m2.5:free": {}
}
}
},
"agent": {
"test": {
"temperature": 0,
"top_p": 1
}
}
}