Commit 87aa0e8
v0.14.0-alpha omnimcode-apiproxy: substrate-rewriting api.anthropic.com proxy
New crate. Reverse-proxies POST /v1/messages requests to api.anthropic.com.
On each request, it walks messages[].content[] (string form, text-block
array form, and tool_result content form) and replaces any text larger than
4096 bytes (configurable) with a small
<omc:ref hash_str=... bytes=... preview=.../> marker. Originals are cached
in the existing MemoryStore _apiproxy_cache namespace, naturally deduped
via the Axis 2 pool.
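The rewrite pass described above can be sketched roughly as follows. This is an illustrative Python sketch, not the crate's code: the hash function (truncated SHA-256), the preview length, the plain dict standing in for the MemoryStore namespace, and all function names are assumptions made for the example.

```python
import hashlib
from xml.sax.saxutils import quoteattr

THRESHOLD_BYTES = 4096  # default threshold named in the commit message
PREVIEW_CHARS = 48      # assumption: preview length is not stated in the source


def make_ref(text, cache):
    """Store the original text and return a small marker that replaces it."""
    raw = text.encode("utf-8")
    # Assumption: a truncated SHA-256 stands in for whatever hash the crate uses.
    hash_str = hashlib.sha256(raw).hexdigest()[:16]
    cache[hash_str] = text  # identical texts share one entry (content-addressed)
    preview = quoteattr(text[:PREVIEW_CHARS])  # escaped and quoted for attribute use
    return f'<omc:ref hash_str="{hash_str}" bytes="{len(raw)}" preview={preview}/>'


def shrink(text, cache, threshold):
    """Replace text only when its UTF-8 size exceeds the threshold."""
    if len(text.encode("utf-8")) > threshold:
        return make_ref(text, cache)
    return text


def rewrite_content(content, cache, threshold=THRESHOLD_BYTES):
    """Handle the three content forms: string, text blocks, tool_result blocks."""
    if isinstance(content, str):
        return shrink(content, cache, threshold)
    if isinstance(content, list):
        out = []
        for block in content:
            if isinstance(block, dict):
                block = dict(block)  # shallow copy so the input is not mutated
                if block.get("type") == "text" and isinstance(block.get("text"), str):
                    block["text"] = shrink(block["text"], cache, threshold)
                elif block.get("type") == "tool_result" and "content" in block:
                    block["content"] = rewrite_content(block["content"], cache, threshold)
            out.append(block)
        return out
    return content  # unknown shapes pass through untouched


def rewrite_request(request, cache, threshold=THRESHOLD_BYTES):
    """Return a copy of the request body with oversized text replaced by markers."""
    out = dict(request)
    messages = []
    for message in request.get("messages", []):
        message = dict(message)
        if "content" in message:
            message["content"] = rewrite_content(message["content"], cache, threshold)
        messages.append(message)
    out["messages"] = messages
    return out
```

Keying the cache by content hash is what makes repeated blocks collapse to one stored original in this sketch; whether the Axis 2 pool dedupes the same way is not stated here.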
Injects an omc_proxy_expand_ref(hash_str) tool into the request's tools
array so the LLM has a documented expansion path (though the alpha doesn't
yet intercept the tool_use response — see README "Known limitations").
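A minimal sketch of the tool injection, assuming the standard Messages API tool shape (name, description, input_schema). The description strings and the helper name inject_expand_tool are hypothetical; only the tool name and its hash_str parameter come from this commit.

```python
# Hypothetical tool definition; the wording of the descriptions is illustrative.
EXPAND_TOOL = {
    "name": "omc_proxy_expand_ref",
    "description": "Return the original text that an <omc:ref .../> marker replaced.",
    "input_schema": {
        "type": "object",
        "properties": {
            "hash_str": {
                "type": "string",
                "description": "The hash_str attribute copied from the marker.",
            }
        },
        "required": ["hash_str"],
    },
}


def inject_expand_tool(request):
    """Return a copy of the request with the expand tool appended exactly once."""
    out = dict(request)
    tools = list(request.get("tools") or [])
    already_present = any(
        isinstance(tool, dict) and tool.get("name") == EXPAND_TOOL["name"]
        for tool in tools
    )
    if not already_present:
        tools.append(EXPAND_TOOL)
    out["tools"] = tools
    return out
```

Making the injection idempotent matters for multi-turn sessions, where a client may echo back a tools array that already contains the entry.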
Smoke-tested with a mock upstream: 7,177-byte request rewritten to
1,081-byte upstream payload — 6.64× compression on a single 6.8KB block.
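For reference, the ratio follows directly from the two byte counts reported above. The variable names are illustrative.

```python
original_bytes = 7177   # request size before rewriting (from the smoke test)
rewritten_bytes = 1081  # payload size sent upstream (from the smoke test)

ratio = original_bytes / rewritten_bytes
reduction = 1 - rewritten_bytes / original_bytes

print(f"{ratio:.2f}x smaller, {reduction:.1%} fewer bytes")
# prints "6.64x smaller, 84.9% fewer bytes"
```

These are byte counts for one request, not token counts across a session.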
This is the v0.14 Option B from the design conversation. Honest scope
correction: the per-turn compression is real and measured, but end-to-end
LLM-token savings depend on the LLM resisting expand calls. Realistic
expectation: 30-60% reduction on tool-heavy long sessions, not the 10-50×
I overpromised earlier.
Threat model: localhost-only bind by default, never logs request bodies,
never reads/logs auth headers.
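The two guarantees above could be enforced with checks along these lines. This is a sketch under assumptions: the function names are hypothetical, and the header set covers only the two headers commonly used for API authentication (authorization and x-api-key).

```python
import ipaddress

# Assumption: these are the auth headers to keep out of any log output.
SENSITIVE_HEADERS = {"authorization", "x-api-key"}


def is_loopback_bind(bind):
    """True when a host:port bind string points at a loopback address."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")  # accept bracketed IPv6 such as [::1]:8088
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unparseable or non-IP hosts are treated as non-loopback


def loggable_headers(headers):
    """Drop auth headers before anything reaches a log line."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }
```

A proxy could refuse to start, or demand an explicit opt-in flag, when is_loopback_bind returns False.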
CLI:
  omnimcode-apiproxy --bind 127.0.0.1:8088 --upstream https://api.anthropic.com
  ANTHROPIC_API_URL=http://localhost:8088 claude # if such an env var existed
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent cdb056f
5 files changed
Lines changed: 1438 additions & 48 deletions