Merged
2 changes: 1 addition & 1 deletion docs/agent-tool-gating.md
Original file line number Diff line number Diff line change
@@ -275,7 +275,7 @@ Wiring (schema/handler layers plus registration order):

The plugin loads the same contract as the adapter (`hermes-governance` / bundled
YAML). Adding a tool is mostly **config + mapper + policy**, plus an entry in
`GOVERNED_BUILTIN_MODULES` when Hermes registers the tool at module import time.
``builtin_module`` in the repo catalog template when Hermes registers the tool at module import time.

**History (before v1):** the first proof gated only `terminal` via
`intentframe-terminal`, which imported `tools.terminal_tool` at plugin load (early
55 changes: 32 additions & 23 deletions docs/hermes-intentframe-integration-guide.md
@@ -153,7 +153,9 @@ Names only — full JSON schemas are probed separately via ``probe_hermes_tool_s

**Rule:** when debugging “model never calls tool X”, verify X appears in the **OpenAI
Tools block** (request dump with `HERMES_DUMP_REQUESTS=1`, trace, or gateway logs),
not only on `/v1/toolsets`. Automated check:
not only on `/v1/toolsets`. For governed builtins, also check preload (`builtin_module`
in yaml) and Hermes per-tool `check_fn` env (e.g. `cronjob` needs `HERMES_GATEWAY_SESSION`).
Automated check:
`RUN_HERMES_GATEWAY_TOOLSETS=1 ./tests/scripts/test-hermes-gateway-toolsets.sh`
(see [`tests/hermes_gateway/README.md`](../tests/hermes_gateway/README.md#toolsets--provider-payload-test-opt-in-networked-llm)).

@@ -271,19 +273,23 @@ unless explicitly added to the contract — govern by **tool name**, not toolset

### 1. Selective preload

```12:29:integrations/hermes/plugin/intentframe-gate/builtin_preload.py
GOVERNED_BUILTIN_MODULES: dict[str, str] = {
    "terminal": "tools.terminal_tool",
    "process": "tools.process_registry",
    "write_file": "tools.file_tools",
    "patch": "tools.file_tools",
}
Each catalog tool may declare ``builtin_module`` in the dev-owned repo
``integrations/hermes/governance/tools.yaml`` (copied to runtime on integrate).
The plugin imports those modules for **enabled** governed tools only:

def preload_governed_builtins(governed: frozenset[str]) -> None:
    ...
    importlib.import_module(module_name)
```yaml
terminal:
  enabled: true
  builtin_module: tools.terminal_tool
cronjob:
  enabled: true
  builtin_module: tools.cronjob_tools
```

[`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py)
validates that each ``builtin_module`` starts with ``tools.`` and imports the unique modules before
the registry snapshot.
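
A minimal sketch of the selective preload described above, not the plugin's actual code. The function names `collect_preload_modules` and `preload`, the dict shape of the parsed catalog, and the choice to skip entries without ``builtin_module`` and to validate only enabled tools are all assumptions made for illustration:

```python
import importlib
from typing import Callable, Mapping


def collect_preload_modules(tools: Mapping[str, Mapping[str, object]]) -> list[str]:
    """Return unique builtin_module values for enabled tools, in catalog order."""
    modules: list[str] = []
    for name, spec in tools.items():
        if not spec.get("enabled", False):
            continue  # disabled tools are never preloaded
        module = spec.get("builtin_module")
        if module is None:
            continue  # assumption: tools without builtin_module are simply skipped
        if not isinstance(module, str) or not module.startswith("tools."):
            raise ValueError(
                f"{name}: builtin_module must start with 'tools.', got {module!r}"
            )
        if module not in modules:
            modules.append(module)  # dedupe shared modules such as tools.file_tools
    return modules


def preload(
    tools: Mapping[str, Mapping[str, object]],
    importer: Callable[[str], object] = importlib.import_module,
) -> None:
    """Import each unique module once; `importer` is injectable for tests."""
    for module in collect_preload_modules(tools):
        importer(module)
```

With a catalog where `write_file` and `patch` both point at `tools.file_tools`, the sketch imports that module once, and a value such as `os.path` is rejected before any import runs.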

**Why not call `discover_builtin_tools()` in the plugin?**

Hermes discovers builtins by AST-scanning `tools/*.py` and importing every module
@@ -403,7 +409,7 @@ the modules first.

### Scrutinize import changes like API changes

When editing `GOVERNED_BUILTIN_MODULES` or any plugin import:
When editing ``builtin_module`` in the repo catalog template or any plugin import:

| Change | Risk |
|--------|------|
@@ -507,15 +513,18 @@ do not change manifest or policy files.

### Step 4 — Plugin preload (if Hermes builtin)

If the tool is a Hermes built-in registered at import time, add to
`GOVERNED_BUILTIN_MODULES`:
If the tool is a Hermes built-in registered at import time, set in repo
``integrations/hermes/governance/tools.yaml``:

```python
"my_tool": "tools.my_tool_module",
```yaml
my_tool:
  enabled: true
  builtin_module: tools.my_tool_module
```

If several catalog names share one module (like `write_file` + `patch` → `file_tools`),
one import is enough — preload dedupes modules.
one import is enough — preload dedupes modules. ``builtin_module`` must start with
``tools.`` (validated at load time).

Delete coverage is via `patch` V4A `*** Delete File:` operations (maps to `DELETE_HOST_FILE`).

@@ -658,7 +667,7 @@ uv run --package intentframe-integrations-cli python tests/intentframe_integrati
uv run --package intentframe-integrations-cli python tests/intentframe_integrations/test_policy_manage.py
```

Extend `test_builtin_preload.py` when adding `GOVERNED_BUILTIN_MODULES` entries.
Extend `test_builtin_preload.py` when adding ``builtin_module`` entries to the catalog template.

### Layer 2 — Toolsets + OpenAI provider payload (networked LLM)

@@ -669,14 +678,14 @@ RUN_HERMES_GATEWAY_TOOLSETS=1 ./tests/scripts/test-hermes-gateway-toolsets.sh
Requires `OPENAI_API_KEY`. After `integrate hermes`:

1. `GET /v1/toolsets` — config tool name surface
2. `probe_hermes_tool_schemas.py` — registry schemas (`reason_required`, gate markers)
2. `probe_hermes_tool_schemas.py` — registry schemas for **all** governed catalog tools (`reason_required`, gate markers); probe env includes `HERMES_GATEWAY_SESSION=1` so `cronjob` passes Hermes `check_fn`
3. `POST /v1/responses` with `HERMES_DUMP_REQUESTS=1` — one real `chat.completions` call
4. Assert token usage > 0 and governed tools have required `reason` in `request.body.tools`
4. Assert token usage > 0 and **all** governed catalog tools have required `reason` in `request.body.tools`

Asserts `terminal: ['process', 'terminal']` on toolsets and provider payload schema
for governed tools. Lighter than full E2E (no tool-calling ALLOW/BLOCK probes).
Lighter than full E2E (no tool-calling ALLOW/BLOCK probes). Covers generic mappers
(e.g. `cronjob`) that gateway E2E omits from LLM probes.
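
The payload assertion in step 4 can be sketched as follows. This is an illustration rather than the test script's real code: the helper name `tools_missing_required_reason` is hypothetical, and it assumes the dumped `request.body.tools` uses the `chat.completions` function-tool shape (`{"type": "function", "function": {"name", "parameters"}}`):

```python
def tools_missing_required_reason(request_body: dict, governed: set[str]) -> list[str]:
    """Return governed tool names that are absent or lack a required `reason`."""
    schemas: dict[str, dict] = {}
    for tool in request_body.get("tools", []):
        function = tool.get("function", {})
        schemas[function.get("name")] = function.get("parameters", {})

    missing: list[str] = []
    for name in sorted(governed):
        parameters = schemas.get(name)
        # A tool fails if it never reached the payload or if `reason` is optional.
        if parameters is None or "reason" not in parameters.get("required", []):
            missing.append(name)
    return missing
```

An empty result means every governed catalog tool reached the provider payload with `reason` marked required; any returned name points at a tool that was dropped or left ungated.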

Details: [`tests/hermes_gateway/README.md`](../tests/hermes_gateway/README.md#toolsets--provider-payload-test-opt-in-networked-llm).
Details and recent bug fixes: [`tests/hermes_gateway/README.md`](../tests/hermes_gateway/README.md#toolsets--provider-payload-test-opt-in-networked-llm).

### Layer 3 — Scoped gateway E2E (fast smoke)

19 changes: 14 additions & 5 deletions docs/hermes-intentframe-state-report.md
@@ -1,6 +1,6 @@
# IntentFrame × Hermes integration — state report

> Snapshot of the Hermes agent integration as of **2026-06-23**. For how-to and
> Snapshot of the Hermes agent integration as of **2026-06-24**. For how-to and
> troubleshooting, see [`hermes-intentframe-integration-guide.md`](./hermes-intentframe-integration-guide.md).

---
@@ -45,7 +45,7 @@ LLM (POST /v1/responses)

| Path | Purpose |
|------|---------|
| `integrations/hermes/governance/tools.yaml` | Default governed-tool **template** (4 entries) |
| `integrations/hermes/governance/tools.yaml` | Default governed-tool **template** (5 entries) |
| `integrations/hermes/policy.yaml` | Shipped policy **template** (RUN_COMMAND + host-file + deletion) |
| `~/.intentframe/integrations/hermes/governance/tools.yaml` | Runtime governed-tool config (user-owned) |
| `~/.intentframe/integrations/hermes/policy.yaml` | Runtime policy config (user-owned) |
@@ -89,12 +89,16 @@ At `register()`:

1. **`install_registry_hook()`** — wrap tools registered later (e.g. MCP refresh).
2. **`preload_governed_builtins(governed)`** — selective import from
`GOVERNED_BUILTIN_MODULES` in [`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py):
each tool's ``builtin_module`` in repo ``tools.yaml`` (implemented in [`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py)):
- `terminal` → `tools.terminal_tool`
- `process` → `tools.process_registry`
- `write_file`, `patch` → `tools.file_tools`
- `cronjob` → `tools.cronjob_tools`
3. **Snapshot loop** — wrap governed entries with `inject_reason()` + `gate_tool_call()`.

`cronjob` also requires `HERMES_GATEWAY_SESSION=1` (or interactive/exec env) to pass
Hermes `check_fn` filtering in `get_tool_definitions()` — preload alone is not enough.
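
To illustrate why preload alone is not enough, here is a toy model of per-tool `check_fn` filtering. It is not Hermes code: `visible_tools`, `gateway_session_active`, and passing the environment as an argument are assumptions for illustration, and the interactive/exec alternatives mentioned above are left out because their exact variables are not given here:

```python
from typing import Callable, Mapping, Optional

# Hypothetical signature: a check receives the environment and returns True to expose the tool.
CheckFn = Callable[[Mapping[str, str]], bool]


def gateway_session_active(env: Mapping[str, str]) -> bool:
    """Models only the HERMES_GATEWAY_SESSION=1 branch of the cronjob check."""
    return env.get("HERMES_GATEWAY_SESSION") == "1"


def visible_tools(
    registered: Mapping[str, Optional[CheckFn]], env: Mapping[str, str]
) -> list[str]:
    """Registered tools still have to pass their check before they are exposed."""
    return [
        name
        for name, check in registered.items()
        if check is None or check(env)
    ]


# Both tools are registered (preloaded); only the environment differs.
registry = {"terminal": None, "cronjob": gateway_session_active}
without_session = visible_tools(registry, {})
with_session = visible_tools(registry, {"HERMES_GATEWAY_SESSION": "1"})
```

In this model `cronjob` is registered in both cases but only appears in `with_session`, which mirrors the symptom of a tool that is preloaded yet missing from the provider payload.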

See [`hermes-plugin-registration-order.md`](./hermes-plugin-registration-order.md) for
load-order evidence and bisect notes.

@@ -136,7 +140,7 @@ or restore defaults. Policy commands apply `agent.json` env via `load_and_activa
| Layer | Entry | LLM / network |
|-------|-------|---------------|
| Unit | `tests/hermes_plugin/`, `tests/hermes_gateway/test_*.py`, adapter tests, `test_policy_manage.py`, `test_integration_pack.py` | No |
| Toolsets + provider payload | `RUN_HERMES_GATEWAY_TOOLSETS=1 ./tests/scripts/test-hermes-gateway-toolsets.sh` | OpenAI `chat.completions` (one round-trip); asserts `tools=` + `reason` in request dump |
| Toolsets + provider payload | `RUN_HERMES_GATEWAY_TOOLSETS=1 ./tests/scripts/test-hermes-gateway-toolsets.sh` | OpenAI `chat.completions` (one round-trip); asserts **all** governed tools + `reason` in request dump |
| Live integration | `./tests/scripts/test-hermes-integration.sh` | Backend; policy reload smoke + adapter/plugin probes (no LLM) |
| Gateway E2E | `RUN_HERMES_GATEWAY_E2E=1 ./tests/scripts/test-hermes-gateway-e2e.sh` | OpenAI + full stack; native-mapper LLM probes only |

@@ -167,7 +171,7 @@ See [`tests/hermes_gateway/README.md`](../tests/hermes_gateway/README.md).

---

## Recent changes (branch `fix-plugin-new-mechanism`)
## Recent changes

| Change | Rationale |
|--------|-----------|
@@ -177,6 +181,11 @@ See [`tests/hermes_gateway/README.md`](../tests/hermes_gateway/README.md).
| Hardened block probe prompts | Fix LLM rewriting `/etc/` to sandbox paths |
| `load_and_activate_pack` + policy env parity | Policy validation sees same manifest env as backend boot |
| `cronjob` generic tool + two-tier probe contract | Live semantic smoke; no gateway LLM E2E for generic mappers |
| **`builtin_module` in repo `tools.yaml`** | Replace hardcoded preload dict; single catalog source for Hermes import paths |
| **Toolsets live: full governed catalog** | Probe/dump had used native E2E tier only — `cronjob` was skipped despite production governance |
| **Toolsets probe: `HERMES_GATEWAY_SESSION=1`** | `cronjob` registered via preload but filtered by Hermes `check_fn` without gateway session env |
| **Toolsets run marker** | Unique token per run for OpenAI Platform log correlation |
| **Loader parity test (`builtin_module`)** | Plugin and shared governance loaders must agree on catalog shape |

---

55 changes: 30 additions & 25 deletions docs/hermes-plugin-registration-order.md
@@ -66,25 +66,29 @@ def discover_builtin_tools(tools_dir: Optional[Path] = None) -> List[str]:
| **`intentframe-gate` (broken)** | Hook + snapshot only — **no preload** | **empty** → `wrapped = []` |
| **`intentframe-gate` (fixed)** | `preload_governed_builtins(governed)` + generic snapshot | governed names present |

The fixed plugin restores the old **early-import effect** generically:

```12:29:integrations/hermes/plugin/intentframe-gate/builtin_preload.py
GOVERNED_BUILTIN_MODULES: dict[str, str] = {
    "terminal": "tools.terminal_tool",
    "process": "tools.process_registry",
    "write_file": "tools.file_tools",
    "patch": "tools.file_tools",
}
...
importlib.import_module(module_name)
The fixed plugin restores the old **early-import effect** generically from the dev-owned
catalog template:

```yaml
# integrations/hermes/governance/tools.yaml (excerpt)
terminal:
  enabled: true
  builtin_module: tools.terminal_tool
cronjob:
  enabled: true
  builtin_module: tools.cronjob_tools
```

[`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py)
imports each enabled tool's ``builtin_module`` (must start with ``tools.``) before snapshot.

Then the same wrap loop runs for every governed name — no terminal-specific
`_register_terminal_override()`:

```20:35:integrations/hermes/plugin/intentframe-gate/__init__.py
governed = governed_tool_names()
preload_governed_builtins(governed)
governed_tools = load_governed_tools()
governed = frozenset(governed_tools)
preload_governed_builtins(governed_tools)

for entry in registry._snapshot_entries():
    if entry.name not in governed:
@@ -97,10 +101,11 @@ Then the same wrap loop runs for every governed name — no terminal-specific

### Why this matters for code review

Treat changes to `GOVERNED_BUILTIN_MODULES` and any plugin `import` like **API
surface changes**:
Treat changes to ``builtin_module`` in the repo ``tools.yaml`` and any plugin ``import``
like **API surface changes**:

- Removing an import line can remove a tool from the OpenAI payload entirely.
- Removing or omitting ``builtin_module`` can remove a tool from the OpenAI payload entirely.
- Invalid ``builtin_module`` values are rejected (must start with ``tools.``).
- Adding `discover_builtin_tools()` can register unrelated tools (`read_terminal`).
- Unit test: [`tests/hermes_plugin/test_builtin_preload.py`](../tests/hermes_plugin/test_builtin_preload.py).

@@ -154,7 +159,7 @@ enough on the gateway path — governed Hermes builtins must be preloaded first.
```mermaid
flowchart TB
subgraph mechanisms["intentframe-gate registration"]
A["1. Selective preload<br/>GOVERNED_BUILTIN_MODULES"]
A["1. Selective preload<br/>tools.yaml builtin_module"]
B["2. Snapshot loop<br/>registry._snapshot_entries()"]
C["3. Registry hook<br/>patch registry.register"]
end
@@ -183,8 +188,8 @@ flowchart TB
**Why not full `discover_builtin_tools()`?** It imports every builtin module.
That pulled in `read_terminal`, which Hermes then merged into the `terminal` toolset
and broke the E2E toolsets contract (`['process', 'terminal']` expected). Selective
preload imports only modules listed in `GOVERNED_BUILTIN_MODULES` for names in the
runtime governed set.
preload imports only the ``builtin_module`` of each **enabled** governed tool in the dev-owned
catalog template (copied to runtime on integrate).

---

@@ -305,7 +310,7 @@ registered a gated override — same **early import + wrap** effect as preload t
```python
install_registry_hook()
governed = governed_tool_names()
preload_governed_builtins(governed) # GOVERNED_BUILTIN_MODULES
preload_governed_builtins(governed_tools) # yaml builtin_module per enabled tool

for entry in registry._snapshot_entries():
    if entry.name not in governed:
@@ -402,21 +407,21 @@ Unit tests: [`tests/hermes_plugin/test_builtin_preload.py`](../tests/hermes_plug

| File | Role |
|------|------|
| [`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py) | `GOVERNED_BUILTIN_MODULES` map + selective `importlib.import_module` |
| [`builtin_preload.py`](../integrations/hermes/plugin/intentframe-gate/builtin_preload.py) | Preload from yaml ``builtin_module`` + selective ``importlib.import_module`` |
| [`schema.py`](../integrations/hermes/plugin/intentframe-gate/schema.py) | `inject_reason()` — terminal-specific reason text branch |
| [`gate.py`](../integrations/hermes/plugin/intentframe-gate/gate.py) | Validate via adapter, strip `reason`, delegate |
| [`registry_hook.py`](../integrations/hermes/plugin/intentframe-gate/registry_hook.py) | Patch `registry.register` for dynamic tools |

When adding a governed Hermes **builtin**, add its import module to
`GOVERNED_BUILTIN_MODULES` (see [`test_builtin_preload.py`](../tests/hermes_plugin/test_builtin_preload.py)).
When adding a governed Hermes **builtin**, set ``builtin_module: tools.<module>`` in the
repo catalog template (see [`test_builtin_preload.py`](../tests/hermes_plugin/test_builtin_preload.py)).

---

## Implications for other governed tools

| Tool | Gateway E2E | Registration note |
|------|-------------|-------------------|
| `terminal`, `process`, `write_file`, `patch` | Probed when in scoped yaml | Listed in `GOVERNED_BUILTIN_MODULES` — preload + snapshot |
| `terminal`, `process`, `write_file`, `patch`, `cronjob` | Probed when in scoped yaml | ``builtin_module`` in repo ``tools.yaml`` — preload + snapshot |

Delete coverage uses `patch` V4A `*** Delete File:` ops (maps to `DELETE_HOST_FILE`).

Expand All @@ -426,7 +431,7 @@ If a governed tool fails with “model never calls tool X”:
2. If X is on `/v1/toolsets` but **not** in the OpenAI Tools list, the registry /
`get_definitions()` path dropped it (missing entry or failed `check_fn`).
3. Check plugin register logs for `wrapped` — empty means preload map may be missing X.
4. Add X to `GOVERNED_BUILTIN_MODULES` if Hermes registers it at module import time.
4. Set ``builtin_module: tools.<module>`` in the repo catalog template if Hermes registers it at module import time.
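
The visible troubleshooting steps can be sketched as a small triage helper. The function name `triage_missing_tool` and its inputs (tool names collected by hand from `/v1/toolsets`, the request dump, and the plugin's `wrapped` log line) are hypothetical; it only encodes the checks listed above and treats "X is not in `wrapped`" as the per-tool form of the empty-`wrapped` case:

```python
def triage_missing_tool(
    tool: str,
    toolset_names: set[str],
    payload_names: set[str],
    wrapped: set[str],
) -> list[str]:
    """Return hints for why `tool` never reaches the model; empty if it is in the payload."""
    if tool in payload_names:
        return []  # the tool reached the OpenAI Tools list; look elsewhere

    hints: list[str] = []
    if tool in toolset_names:
        hints.append(
            "on /v1/toolsets but not in the provider tools list: "
            "registry entry missing or check_fn failed"
        )
    if tool not in wrapped:
        hints.append(
            "not wrapped at plugin register: "
            "check builtin_module in the repo catalog template"
        )
    return hints
```

For a `cronjob` that is wrapped and listed on `/v1/toolsets` yet absent from the payload, the sketch returns only the first hint, which points at `check_fn` rather than at preload.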

**Hermes-native long-term fix:** gateway could call `discover_builtin_tools()` before
`discover_plugins()` (upstream). Until then, the plugin owns selective preload.
25 changes: 23 additions & 2 deletions integrations/hermes/governance/README.md
@@ -2,7 +2,7 @@

| File | Owner | Purpose |
|------|-------|---------|
| `tools.yaml` (repo) | **Dev** | Tool catalog: names, mappers, action IDs, default `enabled` |
| `tools.yaml` (repo) | **Dev** | Tool catalog: names, mappers, action IDs, default `enabled`, `builtin_module` (Hermes preload import path) |
| `tools.yaml` (runtime) | **User** | Same catalog; user toggles `enabled` via `governance enable\|disable` |
| `generic_actions.manifest` (repo) | **Dev** | Static list of all `mapper: generic` action IDs (full catalog superset) |
| `generic_actions.manifest` (runtime) | **Copied once** | Seeded on `integrate hermes`; never overwritten by automation |
@@ -38,11 +38,32 @@ Verify after edits:

```bash
uv run --package intentframe-integrations-cli python tests/intentframe_integrations/test_actions_manifest.py
uv run --package intentframe-integrations-cli python tests/hermes_plugin/test_gate.py
uv run --package intentframe-integrations-cli python tests/hermes_plugin/test_builtin_preload.py
uv run --package intentframe-integrations-cli python integrations/hermes/shared/tests/test_governance.py
```

### `builtin_module` (preload map)

Each governed Hermes builtin declares `builtin_module: tools.<module>` in repo
`tools.yaml`. The intentframe-gate plugin imports unique modules for **enabled**
tools before registry snapshot (see `builtin_preload.py`). Values must start with
`tools.` — validated by both plugin and shared loaders.

**Why yaml, not Python:** a hardcoded preload dict drifted from the catalog (e.g.
`cronjob` governed in yaml but easy to omit from code). Yaml is the single source;
`test_plugin_loader_matches_shared_template` asserts plugin/shared parity including
`builtin_module`.

**`cronjob` nuance:** preload registers the tool, but Hermes `get_tool_definitions()`
also applies `check_cronjob_requirements()` — requires `HERMES_GATEWAY_SESSION=1`
(or interactive/exec env). The toolsets schema probe sets session env to mirror the
gateway; see `tests/hermes_gateway/README.md` (Recent fixes).

## Dev workflow (adding a generic tool)

1. Add entry to `tools.yaml` with `mapper: generic` and a `HERMES_*` action ID.
1. Add entry to `tools.yaml` with `mapper: generic`, a `HERMES_*` action ID, and
`builtin_module: tools.<module>` when Hermes registers the tool at import time.
2. Regenerate committed `generic_actions.manifest` to include the new action ID
(golden test `tests/intentframe_integrations/test_actions_manifest.py` enforces parity).
3. Update `agent.json` `action_types`, shipped `policy.yaml`, and `executor.yaml`
8 changes: 8 additions & 0 deletions integrations/hermes/governance/tools.yaml
@@ -4,6 +4,9 @@
# enabled: true → IntentFrame gates at runtime (plugin wrap + adapter validate).
# enabled: false → spec kept in catalog; Hermes runs the tool ungoverned (no intent sent).
#
# builtin_module → Hermes module that registers this tool at import time (dev-owned preload map).
# Must start with "tools."; plugin imports only for enabled tools before registry snapshot.
#
# User control: intentframe-integrations governance enable|disable hermes <tool>
# writes ONLY the runtime copy (~/.intentframe/.../governance/tools.yaml).
# Restart Hermes gateway + adapter after toggling (governance is cached at process start).
@@ -22,20 +25,23 @@ tools:
    risk: local_process
    mapper: terminal
    blocked_response: terminal_json
    builtin_module: tools.terminal_tool

  process:
    enabled: true
    action: RUN_COMMAND
    risk: local_process
    mapper: process
    blocked_response: generic_json
    builtin_module: tools.process_registry

  write_file:
    enabled: true
    action: WRITE_HOST_FILE
    risk: local_write
    mapper: write_file
    blocked_response: generic_json
    builtin_module: tools.file_tools

  patch:
    enabled: true
@@ -44,10 +50,12 @@ tools:
    risk: local_write
    mapper: patch
    blocked_response: generic_json
    builtin_module: tools.file_tools

  cronjob:
    enabled: true
    action: HERMES_CRONJOB
    risk: local_process
    mapper: generic
    blocked_response: generic_json
    builtin_module: tools.cronjob_tools