2 changes: 2 additions & 0 deletions docs/agent-profile.md
@@ -34,6 +34,8 @@ Define the agent's role, responsibilities, and behavior here.
- `model` (string): AI model to use
- `permissionMode` (string, `claude_code` only): One of `"default"`, `"acceptEdits"`, `"plan"`, `"auto"`, `"bypassPermissions"`. When set, the `claude_code` provider passes `--permission-mode <value>` instead of `--dangerously-skip-permissions`. `cao launch --yolo` overrides this and forces bypass. See [Claude Code permission modes](https://code.claude.com/docs/en/permission-modes).
- `native_agent` (string, `claude_code` only): Name of a native Claude Code agent (`~/.claude/agents/`). When set, the provider passes `--agent <name>` directly and skips system prompt / MCP config decomposition (thin-wrapper mode). See [Claude Code native agent routing](claude-code.md#native-agent-routing).
- `codexProfile` (string, `codex` only): Names a `[profiles.<name>]` block in `~/.codex/config.toml`. When set, the provider drops `--yolo` and passes `--profile <name>` instead. See [Custom Codex Profile](codex-cli.md#custom-codex-profile).
- `codexConfig` (object, `codex` only): Inline Codex config overrides passed as `-c key=value` at launch (e.g. `model_reasoning_effort`, `service_tier`, `features.fast_mode`). Keys may be dotted config paths; values become TOML scalars. See [Inline Codex Config Overrides](codex-cli.md#inline-codex-config-overrides).
- `hermesProfile` (string, `hermes` only): Optional Hermes profile wrapper command CAO should launch instead of the default `hermes`, for example one created with `hermes profile alias test-worker`. This is intentionally separate from `codexProfile`: Codex consumes profile names via `codex --profile <name>`, while Hermes aliases are executable commands launched directly as `<alias> chat ...`. See [Hermes Provider](hermes.md).
- `prompt` (string): Additional prompt text

29 changes: 29 additions & 0 deletions docs/codex-cli.md
@@ -137,6 +137,35 @@ sandbox_mode = "read-only"
approval_policy = "never"
```

### Inline Codex Config Overrides

The `codexConfig` field on an agent profile is a map of Codex config overrides that CAO passes as `-c key=value` flags at launch — the same mechanism used for `developer_instructions` and `mcpServers`. It lets a profile set per-agent Codex knobs (reasoning effort, service tier, fast mode, model, …) **without editing the global `~/.codex/config.toml` or maintaining named profile files**.

- **Keys** may be dotted paths into Codex's config schema (e.g. `model_reasoning_effort`, `service_tier`, `features.fast_mode`).
- **Values** are serialized to TOML scalars: strings are quoted, booleans and numbers are emitted bare. So `model_reasoning_effort: "xhigh"` becomes `-c model_reasoning_effort="xhigh"` and `features.fast_mode: true` becomes `-c features.fast_mode=true`.
- Overrides are applied in **both** the default `--yolo` path and the `--profile <codexProfile>` path, so effort/fast-mode knobs work whether or not a named profile governs sandbox/approvals.
- `codexConfig` **composes** with `codexProfile`. Because Codex applies CLI `-c` overrides last, when the same key is set in both, the `codexConfig` value wins.
- Scope is per-session: nothing is written to the user's global `~/.codex/config.toml`.

For example, this developer agent is pinned to `xhigh` reasoning effort and fast mode:

```markdown
---
name: backend-developer
description: Backend developer agent
provider: codex
role: developer
codexConfig:
  model_reasoning_effort: "xhigh"
  service_tier: "fast"
  features.fast_mode: true
---

You implement backend changes from a task spec.
```

This launches Codex as `codex --yolo … -c model_reasoning_effort="xhigh" -c service_tier="fast" -c features.fast_mode=true`, applying the effort and fast-mode settings to that agent only.
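
As a rough illustration of the mapping described above, the sketch below expands a `codexConfig`-style dict into `-c` arguments. The helper names (`toml_scalar`, `expand_overrides`) are hypothetical stand-ins rather than CAO's actual functions, and the serialization is simplified (no key validation, fewer escapes).

```python
from typing import Any, Dict, List


def toml_scalar(value: Any) -> str:
    """Simplified TOML scalar rendering (illustrative only)."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def expand_overrides(config: Dict[str, Any]) -> List[str]:
    """Turn {"key": value} pairs into ["-c", "key=<toml>", ...]."""
    args: List[str] = []
    for key, value in config.items():
        args.extend(["-c", f"{key}={toml_scalar(value)}"])
    return args


codex_config = {
    "model_reasoning_effort": "xhigh",
    "service_tier": "fast",
    "features.fast_mode": True,
}

# Overrides go after the base flags, in the order the profile lists them.
argv = ["codex", "--yolo"] + expand_overrides(codex_config)
for arg in argv:
    print(arg)
```

Each override is a separate `-c` argument, so the quotes around string values are part of the TOML literal and not shell quoting.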

## Workflows

### 1. Interactive single-agent task
11 changes: 11 additions & 0 deletions src/cli_agent_orchestrator/models/agent_profile.py
@@ -47,6 +47,17 @@ class AgentProfile(BaseModel):
    # permission-floor knob.
    codexProfile: Optional[str] = Field(default=None, min_length=1)

    # Codex-only. Inline Codex config overrides passed as `-c key=value` at
    # launch (e.g. {"model_reasoning_effort": "xhigh", "service_tier": "fast",
    # "features.fast_mode": True}). Keys may be dotted paths into Codex's
    # config.toml schema; values are serialized to TOML scalars (strings are
    # quoted, bools/numbers emitted bare). Applied in both the default --yolo
    # path and the --profile <codexProfile> path, so per-agent knobs like
    # reasoning effort or fast mode need no global ~/.codex/config.toml edits
    # or named profile files. Composes with codexProfile; because Codex applies
    # CLI overrides last, these win on key conflicts.
    codexConfig: Optional[Dict[str, Any]] = None

    # Hermes-only. Optionally names a Hermes profile wrapper command (for
    # example one created by `hermes profile alias <profile>`). When omitted,
    # the Hermes provider launches the default `hermes` command.
64 changes: 63 additions & 1 deletion src/cli_agent_orchestrator/providers/codex.py
@@ -4,7 +4,7 @@
import re
import shlex
import time
-from typing import Optional
+from typing import Any, Optional

from cli_agent_orchestrator.clients.tmux import tmux_client
from cli_agent_orchestrator.models.terminal import TerminalStatus
@@ -118,6 +118,57 @@ def _compute_tui_footer_cutoff(all_lines: list) -> int:
    return len("\n".join(all_lines[:footer_start_idx]))


def _toml_scalar(value: Any) -> str:
    """Serialize a Python scalar to a TOML literal for a ``-c key=<value>`` override.

    Strings become quoted TOML basic strings (backslash, quote, tab, CR, and
    newline escaped so tmux ``send_keys`` keeps the launch command on one
    line); bools become ``true``/``false``; ints and floats are emitted bare.
    Non-scalar values (dict/list/None) raise ``TypeError`` so a misconfigured
    profile fails fast. ``bool`` is checked before ``int`` because ``bool`` is
    a subclass of ``int`` in Python, so the order here is load-bearing: a
    flipped order would pass ``True`` through ``str()`` and render it as
    ``True``, which is not a valid TOML boolean.
    """
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if not isinstance(value, str):
        raise TypeError(
            "codexConfig values must be scalars (str, bool, int, or float); "
            f"got {type(value).__name__}"
        )
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\t", "\\t")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'
Comment on lines +121 to +146

Contributor Author

Partially agreed. The valuable part is real: silently accepting non-scalars (dict/list/None) would serialize a Python repr and fail confusingly in Codex. `_toml_scalar` now raises `TypeError` for non-scalar values, so a misconfigured profile fails fast. I also added `\t`/`\r` escaping for completeness.

That said, in practice codexConfig values are author-controlled config scalars (effort/service-tier strings, bool flags) rather than untrusted input, so the control-character path is more defensive than a real vector here.
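
To make the fail-fast and ordering points concrete, here is a small sketch. Both functions are hypothetical illustrations written for this comment, not the code in the PR: `scalar_int_first` is a deliberately wrong variant that checks `int` before `bool` and falls back to `str()`, while `scalar_bool_first` follows the ordering and rejection behavior described above.

```python
from typing import Any


def scalar_int_first(value: Any) -> str:
    """Deliberately wrong ordering, for illustration only."""
    if isinstance(value, (int, float)):  # also catches bool, a subclass of int
        return str(value)
    if isinstance(value, bool):  # never reached for True/False
        return "true" if value else "false"
    return str(value)  # silently emits a Python repr for dict/list/None


def scalar_bool_first(value: Any) -> str:
    """Checks bool first and rejects non-scalars."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    raise TypeError(f"expected a scalar, got {type(value).__name__}")


print(scalar_int_first(True))      # True      (capitalized, not a TOML boolean)
print(scalar_bool_first(True))     # true
print(scalar_int_first({"a": 1}))  # {'a': 1}  (Python repr, not TOML)

try:
    scalar_bool_first({"a": 1})
except TypeError as exc:
    print(f"rejected: {exc}")      # rejected: expected a scalar, got dict
```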



_CODEX_CONFIG_KEY_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+$")


def _toml_override(key: str, value: Any) -> str:
    """Build one ``key=<toml-scalar>`` Codex ``-c`` override, validating the key.

    Keys must be non-empty dotted config paths over ``[A-Za-z0-9_.-]`` (e.g.
    ``features.fast_mode``); spaces, ``=``, quotes, or control characters are
    rejected so a misconfigured profile fails fast instead of silently emitting
    a malformed ``-c`` override. Value-serialization failures from
    :func:`_toml_scalar` are re-raised with the offending key for context.
    """
    # fullmatch(), not match(): with match(), ``$`` also matches just before a
    # trailing newline, so a key such as "key\n" would pass validation.
    if not isinstance(key, str) or not _CODEX_CONFIG_KEY_PATTERN.fullmatch(key):
        raise ValueError(
            f"Invalid codexConfig key {key!r}: must be a dotted config path over "
            "[A-Za-z0-9_.-] (e.g. 'features.fast_mode')"
        )
    try:
        return f"{key}={_toml_scalar(value)}"
    except TypeError as exc:
        raise TypeError(f"codexConfig key '{key}': {exc}") from exc
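
One detail worth checking with a `^...$` validation pattern in Python: without `re.MULTILINE`, `$` matches at the end of the string and also just before a single trailing newline. The sketch below uses a local copy of the key pattern (the names `KEY_PATTERN` and `results` are illustrative) to compare `match` with `fullmatch` on a few keys.

```python
import re

# Local copy of the key pattern, for illustration.
KEY_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+$")

keys = ["features.fast_mode", "bad key", "key\ninjected", "key\n"]

# Map each key to (accepted by match, accepted by fullmatch).
results = {
    key: (KEY_PATTERN.match(key) is not None, KEY_PATTERN.fullmatch(key) is not None)
    for key in keys
}

for key, (by_match, by_fullmatch) in results.items():
    print(f"{key!r:24} match={by_match!s:5} fullmatch={by_fullmatch}")
```

Only the last key differs: `match` accepts `"key\n"` because `$` is satisfied before the trailing newline, while `fullmatch` requires the pattern to consume the entire string and rejects it.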


def _find_assistant_marker(text: str) -> Optional[re.Match[str]]:
    """Find the first ASSISTANT_PREFIX_PATTERN match in ``text`` whose line
    is not an MCP tool-call marker.
@@ -254,6 +305,17 @@ def _build_codex_command(self) -> str:
if "tool_timeout_sec" not in cfg:
    command_parts.extend(["-c", f"{prefix}.tool_timeout_sec=600.0"])

# Inline Codex config overrides (-c key=value). Lets a profile set
# per-agent Codex knobs — reasoning effort, service tier, fast mode,
# etc. — without editing the global ~/.codex/config.toml or
# maintaining named profile files. Keys may be dotted config paths
# (e.g. "features.fast_mode"); values are serialized to TOML
# scalars. Emitted last so they take precedence over CAO's own
# overrides and the profile/config defaults on key conflicts.
if profile.codexConfig:
    for key, value in profile.codexConfig.items():
        command_parts.extend(["-c", _toml_override(key, value)])

return shlex.join(command_parts)

def _handle_trust_prompt(self, timeout: float = 20.0) -> None:
163 changes: 162 additions & 1 deletion test/providers/test_codex_provider_unit.py
@@ -6,7 +6,12 @@
import pytest

from cli_agent_orchestrator.models.terminal import TerminalStatus
-from cli_agent_orchestrator.providers.codex import CodexProvider, ProviderError
+from cli_agent_orchestrator.providers.codex import (
+    CodexProvider,
+    ProviderError,
+    _toml_override,
+    _toml_scalar,
+)

FIXTURES_DIR = Path(__file__).parent / "fixtures"

@@ -431,6 +436,162 @@ def test_yolo_overrides_codex_profile(self, mock_load):
        assert "--profile" not in command


class TestTomlScalar:
    """Tests for ``_toml_scalar`` TOML-literal serialization."""

    def test_string_is_quoted(self):
        assert _toml_scalar("xhigh") == '"xhigh"'

    def test_bool_true_is_bare(self):
        assert _toml_scalar(True) == "true"

    def test_bool_false_is_bare(self):
        assert _toml_scalar(False) == "false"

    def test_bool_checked_before_int(self):
        # bool is a subclass of int; True must render as "true", not as
        # str(True) == "True", which is not a valid TOML boolean.
        assert _toml_scalar(True) == "true"
        assert _toml_scalar(1) == "1"

    def test_int_is_bare(self):
        assert _toml_scalar(600) == "600"

    def test_float_is_bare(self):
        assert _toml_scalar(600.0) == "600.0"

    def test_string_escapes_quotes_and_backslashes(self):
        assert _toml_scalar('a"b\\c') == '"a\\"b\\\\c"'

    def test_string_escapes_newlines(self):
        # Literal newlines would split the tmux command across lines.
        assert "\n" not in _toml_scalar("line1\nline2")
        assert _toml_scalar("line1\nline2") == '"line1\\nline2"'

    def test_string_escapes_tabs_and_carriage_returns(self):
        assert _toml_scalar("a\tb\rc") == '"a\\tb\\rc"'

    @pytest.mark.parametrize("value", [{"a": 1}, ["x"], None])
    def test_rejects_non_scalar(self, value):
        with pytest.raises(TypeError):
            _toml_scalar(value)


class TestTomlOverride:
    """Tests for ``_toml_override`` key validation."""

    def test_builds_override_for_valid_dotted_key(self):
        assert _toml_override("features.fast_mode", True) == "features.fast_mode=true"
        assert _toml_override("model_reasoning_effort", "xhigh") == 'model_reasoning_effort="xhigh"'

    @pytest.mark.parametrize("key", ["bad key", "a=b", 'k"x', "key\ninjected", "", "a/b"])
    def test_rejects_unsafe_key(self, key):
        # Unsafe keys would produce a malformed -c override or split the tmux
        # command across lines; fail fast instead.
        with pytest.raises(ValueError, match="Invalid codexConfig key"):
            _toml_override(key, "v")

    def test_non_scalar_value_error_names_offending_key(self):
        with pytest.raises(TypeError, match="codexConfig key 'features.x'"):
            _toml_override("features.x", {"nested": 1})


class TestCodexProviderCodexConfig:
    """Tests that profile.codexConfig emits inline ``-c key=value`` overrides."""

    @patch("cli_agent_orchestrator.providers.codex.load_agent_profile")
    def test_codex_config_emits_c_overrides_in_yolo_path(self, mock_load):
        mock_profile = MagicMock()
        mock_profile.model = None
        mock_profile.system_prompt = None
        mock_profile.mcpServers = None
        mock_profile.codexProfile = None
        mock_profile.codexConfig = {
            "model_reasoning_effort": "xhigh",
            "service_tier": "fast",
            "features.fast_mode": True,
        }
        mock_load.return_value = mock_profile

        provider = CodexProvider("tid", "sess", "win", "agent")
        command = provider._build_codex_command()

        # Default --yolo path is kept; overrides are appended as -c key=value.
        # String values are shlex-quoted (the inner key="value" is preserved);
        # the bool value is emitted bare.
        assert "--yolo" in command
        assert 'model_reasoning_effort="xhigh"' in command
        assert 'service_tier="fast"' in command
        assert "features.fast_mode=true" in command

    @patch("cli_agent_orchestrator.providers.codex.load_agent_profile")
    def test_codex_config_composes_with_codex_profile(self, mock_load):
        # codexConfig must apply in the --profile path too, so effort/fast-mode
        # knobs work whether or not a named profile governs sandbox/approvals.
        mock_profile = MagicMock()
        mock_profile.model = None
        mock_profile.system_prompt = None
        mock_profile.mcpServers = None
        mock_profile.codexProfile = "cao_reviewer"
        mock_profile.codexConfig = {"model_reasoning_effort": "high"}
        mock_load.return_value = mock_profile

        provider = CodexProvider("tid", "sess", "win", "agent")
        command = provider._build_codex_command()

        assert "--profile cao_reviewer" in command
        assert "--yolo" not in command
        assert 'model_reasoning_effort="high"' in command

    @patch("cli_agent_orchestrator.providers.codex.load_agent_profile")
    def test_codex_config_none_emits_no_overrides(self, mock_load):
        mock_profile = MagicMock()
        mock_profile.model = None
        mock_profile.system_prompt = None
        mock_profile.mcpServers = None
        mock_profile.codexProfile = None
        mock_profile.codexConfig = None
        mock_load.return_value = mock_profile

        provider = CodexProvider("tid", "sess", "win", "agent")
        command = provider._build_codex_command()

        assert command == "codex --yolo --no-alt-screen --disable shell_snapshot"

    @patch("cli_agent_orchestrator.providers.codex.load_agent_profile")
    def test_codex_config_empty_dict_emits_no_overrides(self, mock_load):
        mock_profile = MagicMock()
        mock_profile.model = None
        mock_profile.system_prompt = None
        mock_profile.mcpServers = None
        mock_profile.codexProfile = None
        mock_profile.codexConfig = {}
        mock_load.return_value = mock_profile

        provider = CodexProvider("tid", "sess", "win", "agent")
        command = provider._build_codex_command()

        assert command == "codex --yolo --no-alt-screen --disable shell_snapshot"

    @patch("cli_agent_orchestrator.providers.codex.load_agent_profile")
    def test_codex_config_composes_with_mcp_and_model(self, mock_load):
        # Regression guard: codexConfig overrides sit alongside the model flag
        # and the -c mcp_servers... wiring without clobbering either.
        mock_profile = MagicMock()
        mock_profile.model = "gpt-5.5"
        mock_profile.system_prompt = None
        mock_profile.mcpServers = {"cao-mcp-server": {"command": "uvx", "args": ["cao-mcp-server"]}}
        mock_profile.codexProfile = None
        mock_profile.codexConfig = {"model_reasoning_effort": "xhigh"}
        mock_load.return_value = mock_profile

        provider = CodexProvider("tid", "sess", "win", "agent")
        command = provider._build_codex_command()

        assert "--model gpt-5.5" in command
        assert "mcp_servers.cao-mcp-server.command=" in command
        assert 'model_reasoning_effort="xhigh"' in command
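
The substring assertions above rely on `shlex.join` leaving the inner `key="value"` text intact. A minimal sketch of that quoting behavior, using only the standard library (the argument list here is illustrative and shorter than the provider's full command):

```python
import shlex

argv = [
    "codex",
    "--yolo",
    "-c",
    'model_reasoning_effort="xhigh"',  # contains double quotes, so it gets quoted
    "-c",
    "features.fast_mode=true",  # only shell-safe characters, so it stays bare
]

command = shlex.join(argv)
print(command)
# codex --yolo -c 'model_reasoning_effort="xhigh"' -c features.fast_mode=true

# The TOML literal survives as a substring of the joined command...
assert 'model_reasoning_effort="xhigh"' in command
# ...and splitting the command recovers the original argument list.
assert shlex.split(command) == argv
```

Because the whole argument is wrapped in single quotes, a value that itself contains a single quote would be split into several quoted segments, and a plain substring check on the command would no longer match it.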


class TestCodexProviderStatusDetection:
    @patch("cli_agent_orchestrator.providers.codex.tmux_client")
    def test_get_status_idle(self, mock_tmux):
46 changes: 45 additions & 1 deletion test/utils/test_agent_profiles.py
@@ -7,7 +7,11 @@
import pytest

from cli_agent_orchestrator.models.agent_profile import AgentProfile
-from cli_agent_orchestrator.utils.agent_profiles import load_agent_profile, resolve_provider
+from cli_agent_orchestrator.utils.agent_profiles import (
+    load_agent_profile,
+    parse_agent_profile_text,
+    resolve_provider,
+)


class TestLoadAgentProfile:
@@ -622,3 +626,43 @@ def test_load_agent_profile_builtin_store_fallback_resolves_vars(
        assert profile.system_prompt == "Body token: builtin-secret"
        assert profile.mcpServers is not None
        assert profile.mcpServers["service"]["env"]["API_TOKEN"] == "builtin-secret"


class TestCodexConfigParsing:
    """codexConfig frontmatter parses into the AgentProfile field."""

    def test_codex_config_parses_dotted_keys_and_mixed_value_types(self):
        text = (
            "---\n"
            "name: codex-agent\n"
            "description: Codex agent with inline config\n"
            "provider: codex\n"
            "codexConfig:\n"
            ' model_reasoning_effort: "xhigh"\n'
            ' service_tier: "fast"\n'
            " features.fast_mode: true\n"
            "---\n"
            "System prompt content"
        )

        profile = parse_agent_profile_text(text, "codex-agent")

        assert profile.codexConfig == {
            "model_reasoning_effort": "xhigh",
            "service_tier": "fast",
            "features.fast_mode": True,
        }

    def test_codex_config_defaults_to_none_when_absent(self):
        text = (
            "---\n"
            "name: codex-agent\n"
            "description: Codex agent without inline config\n"
            "provider: codex\n"
            "---\n"
            "System prompt content"
        )

        profile = parse_agent_profile_text(text, "codex-agent")

        assert profile.codexConfig is None