Commit c55d52c

feat: add financial governance evaluators (spend limits + transaction policy)
Implements the financial governance evaluator proposed in #129, following the technical guidance from @lan17:

1. Decoupled from data source: SpendStore protocol with pluggable backends (InMemorySpendStore included, PostgreSQL/Redis via custom implementation)
2. No new tables in core agent control: self-contained contrib package
3. Context-aware limits: channel/agent/session overrides via evaluate metadata
4. Python SDK compatible: standard Evaluator interface, works with both server and SDK evaluation engine

Two evaluators:

- financial_governance.spend_limit: Cumulative spend tracking with per-transaction caps and rolling period budgets
- financial_governance.transaction_policy: Static policy enforcement (currency allowlists, recipient blocklists, amount bounds)

53 tests passing. Closes #129

Signed-off-by: up2itnow0822 <up2itnow0822@users.noreply.github.com>
Signed-off-by: up2itnow0822 <up2itnow0822@gmail.com>
1 parent da05f98 commit c55d52c

13 files changed

Lines changed: 1716 additions & 0 deletions

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
# Financial Governance Evaluators for Agent Control

Evaluators that enforce financial spend limits and transaction policies for autonomous AI agents.

As agents transact autonomously via protocols like [x402](https://github.com/coinbase/x402) and payment layers like [agentpay-mcp](https://github.com/AI-Agent-Economy/agentpay-mcp), enterprises need governance over what agents spend. These evaluators bring financial policy enforcement into the Agent Control framework.

## Evaluators

### `financial_governance.spend_limit`

Tracks cumulative agent spend and enforces rolling budget limits. The evaluator is stateful: it records approved transactions and checks new ones against accumulated spend.

- **Per-transaction cap**: reject any single payment above a threshold
- **Rolling period budget**: reject payments that would exceed a time-windowed budget
- **Context-aware overrides**: different limits per channel, agent, or session via evaluate metadata
- **Pluggable storage**: abstract `SpendStore` protocol with built-in `InMemorySpendStore`; bring your own PostgreSQL, Redis, etc.

### `financial_governance.transaction_policy`

Static policy checks with no state tracking. Enforces structural rules on individual transactions.

- **Currency allowlist**: only permit specific currencies (e.g., `["USDC", "USDT"]`)
- **Recipient blocklist/allowlist**: control which addresses an agent can pay
- **Amount bounds**: minimum and maximum per-transaction limits

## Installation

```bash
# From the repo root (development)
cd evaluators/contrib/financial-governance
pip install -e ".[dev]"
```

## Configuration

### Spend Limit

```yaml
evaluators:
  - type: financial_governance.spend_limit
    config:
      max_per_transaction: 100.0  # Max USDC per single payment
      max_per_period: 1000.0      # Rolling 24h budget
      period_seconds: 86400       # Budget window (default: 24 hours)
      currency: USDC              # Currency to govern
```

### Transaction Policy

```yaml
evaluators:
  - type: financial_governance.transaction_policy
    config:
      allowed_currencies: [USDC, USDT]
      blocked_recipients: ["0xDEAD..."]
      allowed_recipients: ["0xALICE...", "0xBOB..."]
      min_amount: 0.01
      max_amount: 5000.0
```

## Input Data Schema

Both evaluators expect a dict with these fields:

```python
{
    "amount": 50.0,            # required: transaction amount
    "currency": "USDC",        # required: payment currency
    "recipient": "0xABC...",   # required: payment recipient
    # optional context (used for channel-specific overrides and logging)
    "channel": "slack-trading",
    "agent_id": "agent-42",
    "session_id": "sess-abc",
    # optional per-call limit overrides (spend_limit evaluator only)
    "channel_max_per_transaction": 50.0,
    "channel_max_per_period": 200.0,
}
```

Result convention: `matched=True` means the transaction **violates** the policy.

## Context-Aware Limits

The spend limit evaluator supports channel-specific overrides via the data dict, so per-context policies do not need separate evaluator instances:

```python
# Base policy: $1000/day max, $100/tx max
# But the "experimental" channel gets tighter limits:
result = await evaluator.evaluate({
    "amount": 75.0,
    "currency": "USDC",
    "recipient": "0xABC",
    "channel": "experimental",
    "channel_max_per_transaction": 50.0,  # override: $50 cap for this channel
    "channel_max_per_period": 200.0,      # override: $200/day for this channel
})
```
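
One plausible way to resolve the limits for a call like the one above is to let an override key replace the base value whenever it is present. The helper below is a hypothetical sketch of that rule, not code taken from the evaluator.

```python
def effective_limits(data: dict, base_per_tx: float, base_per_period: float) -> tuple[float, float]:
    """Pick the per-call override when present, otherwise the configured base limit."""
    per_tx = data.get("channel_max_per_transaction", base_per_tx)
    per_period = data.get("channel_max_per_period", base_per_period)
    return float(per_tx), float(per_period)
```

Under this assumption, the 75.0 payment in the example is checked against the 50.0 channel cap rather than the 100.0 base cap.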

## Custom SpendStore

The `SpendStore` protocol requires two methods, `record_spend` and `get_spend`. Implement them for your backend. The example below assumes the psycopg 3 driver and an existing `agent_spend` table:

```python
import json

import psycopg  # assumption: psycopg 3; swap in the driver your deployment uses

from agent_control_evaluator_financial_governance.spend_limit import (
    SpendStore,  # the protocol this class satisfies structurally
    SpendLimitConfig,
    SpendLimitEvaluator,
)


class PostgresSpendStore:
    """Example: PostgreSQL-backed spend tracking."""

    def __init__(self, connection_string: str):
        # autocommit so every recorded spend is persisted immediately
        self._conn = psycopg.connect(connection_string, autocommit=True)

    def record_spend(self, amount: float, currency: str, metadata: dict | None = None) -> None:
        self._conn.execute(
            "INSERT INTO agent_spend (amount, currency, metadata, recorded_at) VALUES (%s, %s, %s, NOW())",
            (amount, currency, json.dumps(metadata)),
        )

    def get_spend(self, currency: str, since_timestamp: float) -> float:
        row = self._conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM agent_spend WHERE currency = %s AND recorded_at >= to_timestamp(%s)",
            (currency, since_timestamp),
        ).fetchone()
        return float(row[0])


# Use it:
config = SpendLimitConfig(max_per_transaction=100.0, max_per_period=1000.0, currency="USDC")
store = PostgresSpendStore("postgresql://...")
evaluator = SpendLimitEvaluator(config, store=store)
```

## Running Tests

```bash
cd evaluators/contrib/financial-governance
pip install -e ".[dev]"
pytest tests/ -v
```

## Design Decisions

1. **Decoupled from data source**: The `SpendStore` protocol means no new tables in core Agent Control. Bring your own persistence.
2. **Context-aware limits**: Override keys in the evaluate data dict allow per-channel, per-agent, or per-session limits without multiple evaluator instances.
3. **Python SDK compatible**: Uses the standard evaluator interface; works with both the server and the Python SDK evaluation engine.
4. **Fail-open on errors**: Missing or malformed data returns `matched=False` with an `error` field, following Agent Control conventions.
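
The fail-open behaviour in item 4 can be illustrated with a small wrapper. This is a sketch under stated assumptions: the result is a plain dict standing in for the framework's result type, and `evaluate_fail_open` and `REQUIRED_FIELDS` are hypothetical names.

```python
from collections.abc import Callable

REQUIRED_FIELDS = ("amount", "currency", "recipient")


def evaluate_fail_open(data: object, check: Callable[[dict], bool]) -> dict:
    """Run `check` on well-formed input; report bad input as an error, not a match."""
    if not isinstance(data, dict):
        return {"matched": False, "error": "input must be a dict"}
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        return {"matched": False, "error": f"missing field(s): {', '.join(missing)}"}
    try:
        amount = float(data["amount"])
    except (TypeError, ValueError):
        return {"matched": False, "error": "amount must be numeric"}
    # Only well-formed transactions reach the actual policy check.
    return {"matched": check({**data, "amount": amount}), "error": None}
```

Since `matched=False` also covers malformed input, a caller that wants stricter handling would need to inspect the `error` field rather than rely on `matched` alone.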

## Related Projects

- [x402](https://github.com/coinbase/x402): HTTP 402 payment protocol
- [agentpay-mcp](https://github.com/up2itnow0822/agentpay-mcp): MCP server for non-custodial agent payments

## License

Apache-2.0. See [LICENSE](../../../LICENSE).
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
[project]
name = "agent-control-evaluator-financial-governance"
version = "0.1.0"
description = "Financial governance evaluators for agent-control: spend limits and transaction policy enforcement"
readme = "README.md"
requires-python = ">=3.12"
license = { text = "Apache-2.0" }
authors = [{ name = "agent-control contributors" }]
keywords = ["agent-control", "evaluator", "financial", "spend-limit", "x402", "agentpay"]
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: Apache Software License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.12",
    "Topic :: Software Development :: Libraries",
]
dependencies = [
    "agent-control-evaluators>=3.0.0",
    "agent-control-models>=3.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "pytest-cov>=4.0.0",
    "ruff>=0.1.0",
    "mypy>=1.8.0",
]

[project.entry-points."agent_control.evaluators"]
"financial_governance.spend_limit" = "agent_control_evaluator_financial_governance.spend_limit:SpendLimitEvaluator"
"financial_governance.transaction_policy" = "agent_control_evaluator_financial_governance.transaction_policy:TransactionPolicyEvaluator"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/agent_control_evaluator_financial_governance"]

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I"]

[tool.pytest.ini_options]
asyncio_mode = "auto"

[tool.uv.sources]
agent-control-evaluators = { path = "../../builtin", editable = true }
agent-control-models = { path = "../../../models", editable = true }
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
"""Financial governance evaluators for agent-control.

Provides two evaluators for enforcing financial policy on AI agent transactions:

- ``financial_governance.spend_limit``: Tracks cumulative spend against rolling
  period budgets and per-transaction caps.
- ``financial_governance.transaction_policy``: Static policy checks: allowlists,
  blocklists, amount bounds, and permitted currencies.

Both evaluators are registered automatically when this package is installed and
the ``agent_control.evaluators`` entry point group is discovered.

Example usage in an agent-control control config::

    {
        "condition": {
            "selector": {"path": "*"},
            "evaluator": {
                "name": "financial_governance.spend_limit",
                "config": {
                    "max_per_transaction": 100.0,
                    "max_per_period": 1000.0,
                    "period_seconds": 86400,
                    "currency": "USDC"
                }
            }
        },
        "action": {"decision": "deny"}
    }
"""

from agent_control_evaluator_financial_governance.spend_limit import (
    SpendLimitConfig,
    SpendLimitEvaluator,
)
from agent_control_evaluator_financial_governance.transaction_policy import (
    TransactionPolicyConfig,
    TransactionPolicyEvaluator,
)

__all__ = [
    "SpendLimitEvaluator",
    "SpendLimitConfig",
    "TransactionPolicyEvaluator",
    "TransactionPolicyConfig",
]
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
"""Spend-limit evaluator package."""

from .config import SpendLimitConfig
from .evaluator import SpendLimitEvaluator
from .store import InMemorySpendStore, SpendStore

__all__ = [
    "SpendLimitEvaluator",
    "SpendLimitConfig",
    "SpendStore",
    "InMemorySpendStore",
]
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
"""Configuration model for the spend-limit evaluator."""

from __future__ import annotations

from pydantic import Field, field_validator

from agent_control_evaluators import EvaluatorConfig


class SpendLimitConfig(EvaluatorConfig):
    """Configuration for :class:`~.evaluator.SpendLimitEvaluator`.

    All monetary fields are expressed in the units of *currency*.

    Attributes:
        max_per_transaction: Hard cap on any single transaction amount. A
            transaction whose ``amount`` exceeds this value is blocked
            regardless of accumulated period spend. Set to ``0.0`` to disable.
        max_per_period: Maximum total spend allowed within the rolling
            *period_seconds* window. Set to ``0.0`` to disable.
        period_seconds: Length of the rolling budget window in seconds.
            Defaults to ``86400`` (24 hours).
        currency: Currency symbol this policy applies to (e.g. ``"USDC"``).
            Transactions whose currency does not match are passed through as
            *not matched* (i.e. allowed).

    Example config dict::

        {
            "max_per_transaction": 500.0,
            "max_per_period": 5000.0,
            "period_seconds": 86400,
            "currency": "USDC"
        }
    """

    max_per_transaction: float = Field(
        default=0.0,
        ge=0.0,
        description=(
            "Per-transaction spend cap in *currency* units. "
            "0.0 means no per-transaction limit."
        ),
    )
    max_per_period: float = Field(
        default=0.0,
        ge=0.0,
        description=(
            "Maximum cumulative spend allowed in the rolling period window. "
            "0.0 means no period limit."
        ),
    )
    period_seconds: int = Field(
        default=86_400,
        ge=1,
        description="Rolling budget window length in seconds (default: 86400 = 24 h).",
    )
    currency: str = Field(
        ...,
        min_length=1,
        description="Currency symbol this policy applies to (e.g. 'USDC', 'ETH').",
    )

    @field_validator("currency")
    @classmethod
    def normalize_currency(cls, v: str) -> str:
        """Normalize currency symbol to upper-case for consistent comparison."""
        return v.upper()
