Commit 29b5abe

merge: codex/p3-01-chat-harness-contracts
2 parents 488e26a + 0f199d8 commit 29b5abe

16 files changed

Lines changed: 819 additions & 325 deletions

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -2,6 +2,14 @@
 
 ## 2026-03-15
 
+### Ship Phase 3 Chat Harness Vocabulary And Contracts
+
+- Added a normalized `ChatHarness` contract with serialization-friendly request, result, event, failure, identity, capability, and observability types in `agents/chat_harness.py`, while keeping `BaseAgent` only as a compatibility shim.
+- Refactored the FastAPI startup, readiness, and send-message flow in `main.py`, `utils/diagnostics.py`, and `services/chat_turns.py` so the app layer now talks to harness-level contracts and normalized failures instead of catching OpenAI SDK exceptions directly.
+- Adapted the shipped OpenAI path in `agents/openai_agent.py` to expose explicit harness identity, normalized `run()` behavior, and harness-owned observability metadata without changing the current non-streaming chat behavior.
+- Updated contributor-facing guidance in `README.md` and `plans/PHASE 3 DESIGN.md`, moved `P3-01` out of `plans/PHASE 3 BACKLOG.md`, and recorded the shipped slice in `plans/done/PHASE 3 DONE.md`.
+- Verification passed with `uv run ruff check .`, `uv run mypy .`, and `uv run python -m pytest` (`178 passed`).
+
 ### Ship Phase 2 Test And Documentation Expansion
 
 - Added repository, service, and route regression coverage for replayed `failed` and `conflicted` requests, duplicate `processing` requests, archived target rejection, and archived mid-flight conflict handling.
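
Reviewer note on the second changelog bullet: the app layer is described as handling normalized failures rather than OpenAI SDK exceptions. A minimal sketch of what that handling might look like follows. The two classes are simplified stand-ins for the types in `agents/chat_harness.py`, while `STATUS_BY_CODE` and `send_message` are hypothetical names introduced only for illustration; the actual status mapping in `services/chat_turns.py` is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ChatHarnessFailure:
    """Simplified stand-in for the type in agents/chat_harness.py."""

    code: str
    message: str
    retryable: bool
    detail: str | None = None


class ChatHarnessExecutionError(RuntimeError):
    """Stand-in: carries a normalized failure instead of an SDK exception."""

    def __init__(self, failure: ChatHarnessFailure):
        self.failure = failure
        super().__init__(failure.message)


# Hypothetical mapping (assumption): the app's real status codes are not shown.
STATUS_BY_CODE = {
    "rate_limited": 429,
    "timeout": 504,
    "invalid_request": 400,
}


def send_message(run, message: str) -> tuple[int, str]:
    """Call a harness-like callable and translate failures to (status, body)."""
    try:
        return 200, run(message)
    except ChatHarnessExecutionError as exc:
        # Unknown failure codes fall back to a generic 503 in this sketch.
        status = STATUS_BY_CODE.get(exc.failure.code, 503)
        return status, exc.failure.message


def flaky_run(message: str) -> str:
    # Simulates a provider that is currently rate limiting requests.
    raise ChatHarnessExecutionError(
        ChatHarnessFailure("rate_limited", "Please retry shortly.", retryable=True)
    )


print(send_message(flaky_run, "hello"))
print(send_message(str.upper, "hello"))
```

The point of the sketch is that the caller only needs to know the failure vocabulary (`code`, `message`, `retryable`), so swapping providers should not change the route-level error handling.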

README.md

Lines changed: 10 additions & 7 deletions
@@ -21,7 +21,7 @@ Those documents define the long-term direction and maturity phases. This README
 - In-flight request locking plus persisted request IDs so duplicate submissions are replayed instead of being processed twice.
 - Lightweight loading feedback while switching chats.
 - Inline failure handling for validation, service-unavailable, and transport-error states.
-- OpenAI-backed agent implementation (`gpt-5-mini` by default).
+- OpenAI-backed chat harness implementation (`gpt-5-mini` by default).
 - SQLite-backed chat storage with per-client chat ownership and transcript persistence across reloads and restarts.
 - Prompt-template-driven system and user prompt construction.
 - Neutral `AI Chat` defaults with no implicit domain context beyond the persisted transcript for the active chat.
@@ -43,7 +43,7 @@ Phase 2 is complete when the default app behaves as a durable, browser-cookie-sc
 The repository has moved well past the original Phase 1 boundary. The notes below stay here as historical context for the startup/configuration baseline that still underpins the current app:
 
 - Startup path resolution is project-root-aware rather than dependent on the shell's current working directory.
-- Runtime behavior is configurable through environment variables instead of route-level or agent-level constants.
+- Runtime behavior is configurable through environment variables instead of route-level or harness-level constants.
 - Default CORS behavior matches the current no-auth posture: wildcard origins are allowed, but credentials stay disabled unless you opt into explicit origins.
 - Prompt/template selection, OpenAI model choice, timeout, and compatible temperature settings can be changed without modifying application code.
 - The browser chat flow prevents duplicate sends, uses a minimal typing indicator during normal requests, and renders degraded-service states inline when the backend is unavailable or a request fails.
@@ -172,9 +172,10 @@ Forks that move beyond trusted local or internal use should plan explicit securi
 
 ```
 basic_chat_app/
-├── agents/ # AI agent implementations
-│ ├── base_agent.py # Abstract base agent class
-│ └── openai_agent.py # OpenAI-specific agent implementation
+├── agents/ # Chat harness contracts and implementations
+│ ├── base_agent.py # Legacy compatibility shim and harness re-exports
+│ ├── chat_harness.py # Core ChatHarness contract and normalized types
+│ └── openai_agent.py # OpenAI-specific harness implementation
 ├── persistence/ # SQLite bootstrap and chat repository code
 ├── static/ # Static assets
 │ ├── css/ # CSS styles
@@ -245,10 +246,12 @@ uv run pre-commit install --hook-type pre-commit --hook-type pre-push
 
 ### Adding New Features
 
-1. **New Agent Types**: Extend the `BaseAgent` class in `agents/base_agent.py`
+1. **New Harness Types**: Implement `ChatHarness` in `agents/chat_harness.py`. `BaseAgent` remains available only as a compatibility shim for legacy `process_message()` implementations.
 2. **Custom Prompts**: Add new templates in `templates/prompts/<agent_type>/`
 3. **UI Components**: Add new components in `templates/components/`
 
+The application layer should own routing, persistence, idempotent turn lifecycle, and HTML rendering. The harness layer should own normalized request/result/failure contracts, observability metadata, prompt assembly, and provider-facing execution.
+
 ### Configuration
 
 - Logging configuration can be modified in `utils/logging_config.py`
@@ -276,7 +279,7 @@ For the default no-auth baseline, keep `CORS_ALLOW_CREDENTIALS=false`. If you en
 
 - Prompts: edit `templates/prompts/openai/` to change the default system or user prompt behavior.
 - Model and runtime settings: use environment variables first, then `utils/settings.py` if you need to change the supported configuration surface.
-- Provider wiring: edit `agents/openai_agent.py` to change OpenAI-specific request construction or swap in a different agent implementation behind the existing app contract.
+- Provider wiring: edit `agents/openai_agent.py` for OpenAI-specific request construction, or add a new implementation in `agents/` behind the `ChatHarness` contract without changing the route layer.
 - Chat UI behavior: edit `templates/components/chat.html`, `static/js/chat.js`, and `static/css/chat.css`.
 - Visual baselines: update `tests/e2e/snapshots/` only when a deliberate UI change is accepted.

agents/__init__.py

Lines changed: 27 additions & 2 deletions
@@ -1,4 +1,29 @@
-from .base_agent import BaseAgent
+from .base_agent import (
+    BaseAgent,
+    ChatHarness,
+    ChatHarnessCapabilities,
+    ChatHarnessExecutionError,
+    ChatHarnessEvent,
+    ChatHarnessFailure,
+    ChatHarnessIdentity,
+    ChatHarnessObservability,
+    ChatHarnessRequest,
+    ChatHarnessResult,
+    ConversationTurn,
+)
 from .openai_agent import OpenAIAgent
 
-__all__ = ['BaseAgent', 'OpenAIAgent']
+__all__ = [
+    "BaseAgent",
+    "ChatHarness",
+    "ChatHarnessCapabilities",
+    "ChatHarnessExecutionError",
+    "ChatHarnessEvent",
+    "ChatHarnessFailure",
+    "ChatHarnessIdentity",
+    "ChatHarnessObservability",
+    "ChatHarnessRequest",
+    "ChatHarnessResult",
+    "ConversationTurn",
+    "OpenAIAgent",
+]

agents/base_agent.py

Lines changed: 26 additions & 32 deletions
@@ -1,33 +1,27 @@
-from abc import ABC, abstractmethod
-from dataclasses import dataclass
-from typing import Literal, Sequence
+from .chat_harness import (
+    BaseAgent,
+    ChatHarness,
+    ChatHarnessCapabilities,
+    ChatHarnessExecutionError,
+    ChatHarnessEvent,
+    ChatHarnessFailure,
+    ChatHarnessIdentity,
+    ChatHarnessObservability,
+    ChatHarnessRequest,
+    ChatHarnessResult,
+    ConversationTurn,
+)
 
-
-@dataclass(frozen=True)
-class ConversationTurn:
-    role: Literal["user", "assistant"]
-    content: str
-
-class BaseAgent(ABC):
-    """Abstract base class for all agents"""
-
-    @property
-    @abstractmethod
-    def display_name(self):
-        """Return the display name for the agent to be shown in the header"""
-        pass
-
-    @property
-    @abstractmethod
-    def model_display_name(self):
-        """Return a user-friendly display name for the model"""
-        pass
-
-    @abstractmethod
-    def process_message(
-        self,
-        message: str,
-        conversation_history: Sequence[ConversationTurn] | None = None,
-    ) -> str:
-        """Process a user message and return a response"""
-        pass
+__all__ = [
+    "BaseAgent",
+    "ChatHarness",
+    "ChatHarnessCapabilities",
+    "ChatHarnessExecutionError",
+    "ChatHarnessEvent",
+    "ChatHarnessFailure",
+    "ChatHarnessIdentity",
+    "ChatHarnessObservability",
+    "ChatHarnessRequest",
+    "ChatHarnessResult",
+    "ConversationTurn",
+]

agents/chat_harness.py

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
+from __future__ import annotations
+
+from abc import ABC, abstractmethod
+from dataclasses import dataclass, field
+from typing import Literal
+from collections.abc import Iterator, Sequence
+
+
+FailureCode = Literal[
+    "rate_limited",
+    "authentication_failed",
+    "timeout",
+    "connection_error",
+    "invalid_request",
+    "provider_error",
+    "empty_response",
+    "unexpected_error",
+]
+
+EventType = Literal["output_text", "completed", "failed"]
+
+
+@dataclass(frozen=True)
+class ConversationTurn:
+    role: Literal["user", "assistant"]
+    content: str
+
+
+@dataclass(frozen=True)
+class ChatHarnessIdentity:
+    key: str
+    display_name: str
+    model_display_name: str
+    provider_name: str | None = None
+    version: str | None = None
+
+
+@dataclass(frozen=True)
+class ChatHarnessCapabilities:
+    supports_streaming: bool = False
+    supports_tools: bool = False
+    supports_context_builders: bool = False
+
+
+@dataclass(frozen=True)
+class ChatHarnessObservability:
+    model: str | None = None
+    provider: str | None = None
+    request_id: str | None = None
+    tags: dict[str, str] = field(default_factory=dict)
+
+
+@dataclass(frozen=True)
+class ChatHarnessFailure:
+    code: FailureCode
+    message: str
+    retryable: bool
+    detail: str | None = None
+
+
+@dataclass(frozen=True)
+class ChatHarnessRequest:
+    message: str
+    conversation_history: tuple[ConversationTurn, ...] = ()
+    request_id: str | None = None
+    chat_session_id: int | None = None
+    client_id: str | None = None
+    metadata: dict[str, str] = field(default_factory=dict)
+
+    def __post_init__(self) -> None:
+        object.__setattr__(self, "conversation_history", tuple(self.conversation_history))
+        object.__setattr__(self, "metadata", dict(self.metadata))
+
+
+@dataclass(frozen=True)
+class ChatHarnessResult:
+    output_text: str | None = None
+    finish_reason: str = "completed"
+    failure: ChatHarnessFailure | None = None
+    observability: ChatHarnessObservability = field(default_factory=ChatHarnessObservability)
+    metadata: dict[str, str] = field(default_factory=dict)
+
+    def __post_init__(self) -> None:
+        object.__setattr__(self, "metadata", dict(self.metadata))
+        if self.failure is None and not self.output_text:
+            raise ValueError("Successful harness results require output_text.")
+        if self.failure is not None and self.output_text is not None:
+            raise ValueError("Failed harness results cannot include output_text.")
+
+
+@dataclass(frozen=True)
+class ChatHarnessEvent:
+    event_type: EventType
+    output_text: str | None = None
+    failure: ChatHarnessFailure | None = None
+    observability: ChatHarnessObservability = field(default_factory=ChatHarnessObservability)
+    sequence: int = 0
+    metadata: dict[str, str] = field(default_factory=dict)
+
+    def __post_init__(self) -> None:
+        object.__setattr__(self, "metadata", dict(self.metadata))
+
+
+class ChatHarnessExecutionError(RuntimeError):
+    """Raised when a harness fails with a normalized failure."""
+
+    def __init__(self, failure: ChatHarnessFailure):
+        self.failure = failure
+        super().__init__(failure.message)
+
+
+class ChatHarness(ABC):
+    """App-facing contract for harness implementations."""
+
+    @property
+    @abstractmethod
+    def identity(self) -> ChatHarnessIdentity:
+        """Return stable harness identity and display metadata."""
+
+    @property
+    def capabilities(self) -> ChatHarnessCapabilities:
+        return ChatHarnessCapabilities()
+
+    @abstractmethod
+    def run(self, request: ChatHarnessRequest) -> ChatHarnessResult:
+        """Execute one harness request and return the normalized result."""
+
+    def run_events(self, request: ChatHarnessRequest) -> Iterator[ChatHarnessEvent]:
+        result = self.run(request)
+        if result.output_text is not None:
+            yield ChatHarnessEvent(
+                event_type="output_text",
+                output_text=result.output_text,
+                observability=result.observability,
+                metadata=result.metadata,
+                sequence=0,
+            )
+        if result.failure is not None:
+            yield ChatHarnessEvent(
+                event_type="failed",
+                failure=result.failure,
+                observability=result.observability,
+                metadata=result.metadata,
+                sequence=1,
+            )
+            return
+        yield ChatHarnessEvent(
+            event_type="completed",
+            output_text=result.output_text,
+            observability=result.observability,
+            metadata=result.metadata,
+            sequence=1,
+        )
+
+
+class BaseAgent(ChatHarness, ABC):
+    """Compatibility layer for the legacy non-harness agent interface."""
+
+    @property
+    @abstractmethod
+    def display_name(self) -> str:
+        """Return the display name for the agent to be shown in the header."""
+
+    @property
+    @abstractmethod
+    def model_display_name(self) -> str:
+        """Return a user-friendly display name for the model."""
+
+    @property
+    def identity(self) -> ChatHarnessIdentity:
+        return ChatHarnessIdentity(
+            key=self.__class__.__name__.lower(),
+            display_name=self.display_name,
+            model_display_name=self.model_display_name,
+        )
+
+    def run(self, request: ChatHarnessRequest) -> ChatHarnessResult:
+        try:
+            response_text = self.process_message(
+                request.message,
+                request.conversation_history,
+            )
+        except ValueError:
+            raise
+        except Exception as exc:
+            raise ChatHarnessExecutionError(self.normalize_exception(exc)) from exc
+        return ChatHarnessResult(
+            output_text=response_text,
+            observability=ChatHarnessObservability(
+                model=self.model_display_name,
+                request_id=request.request_id,
+            ),
+        )
+
+    def normalize_exception(self, exc: Exception) -> ChatHarnessFailure:
+        return ChatHarnessFailure(
+            code="unexpected_error",
+            message="Harness execution failed.",
+            retryable=False,
+            detail=str(exc),
+        )
+
+    @abstractmethod
+    def process_message(
+        self,
+        message: str,
+        conversation_history: Sequence[ConversationTurn] | None = None,
+    ) -> str:
+        """Process a user message and return a response."""
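
Reviewer note on the `__post_init__` hooks in `agents/chat_harness.py`: the request, result, and event types are frozen dataclasses that still normalize their container fields through `object.__setattr__`. The sketch below isolates that pattern on a reduced stand-in class (`Request` is a hypothetical name, and its history holds plain strings rather than `ConversationTurn` objects) to show what the copies appear to buy.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Request:
    """Reduced stand-in for ChatHarnessRequest using the same __post_init__ idea."""

    message: str
    conversation_history: tuple[str, ...] = ()
    metadata: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # frozen=True blocks normal assignment, so object.__setattr__ is used
        # to store normalized copies of the caller's containers.
        object.__setattr__(self, "conversation_history", tuple(self.conversation_history))
        object.__setattr__(self, "metadata", dict(self.metadata))


history = ["hi", "hello"]
tags = {"source": "web"}
request = Request("next question", history, tags)

# Later mutation of the caller's objects does not leak into the request.
history.append("late turn")
tags["source"] = "changed"

print(request.conversation_history)
print(request.metadata)

try:
    request.message = "edited"
except FrozenInstanceError:
    print("request fields cannot be reassigned")
```

Note that the `dict(...)` copy is shallow and the stored dictionary can still be mutated in place, so the freeze protects field reassignment and caller aliasing rather than deep immutability.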
