Fail fast on oversized plain LM prompts and clarify llm_query semantics by taivu1998 · Pull Request #143 · alexzhang13/rlm

taivu1998 · 2026-04-01T23:08:56Z

Summary

This PR adds a fail-fast context-window guard for plain LM calls and clarifies when models should use llm_query(...) versus rlm_query(...).

Closes #42.

Problem

Issue #42 reports context-window failures when large prompts are sent through sub-calls. A review of the code path showed that the main practical gap is not child RLM prompt inheritance but plain llm_query(...) and leaf LM calls. These calls could send oversized prompts directly to provider SDKs with no preflight validation, which led to provider-specific failures and inconsistent error messages.

What Changed

Added ContextWindowExceededError for oversized prompt validation failures.
Added shared prompt-fit utilities in rlm.utils.token_utils:
- estimate_text_tokens(...)
- count_prompt_tokens(...)
- validate_prompt_fits_context_window(...)
Enforced prompt-size validation before provider SDK calls in all built-in LM clients:
- OpenAI
- Anthropic
- Gemini
- Azure OpenAI
- Portkey
Reused the existing LMHandler and REPL error propagation path so validation errors surface cleanly in llm_query(...) and in the rlm_query(...) fallback path.
Re-exported ContextWindowExceededError from the top-level rlm package.
Updated system prompt and docs to clarify:
- llm_query(...) is for prompts that already fit the target model's context window.
- rlm_query(...) is the deeper/offloaded path where recursive subcalls are available.
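
As a rough illustration of the preflight guard described above, here is a minimal sketch. The function and exception names come from this PR, but the signatures, the 4-characters-per-token heuristic, the `MODEL_CONTEXT_LIMITS` table, and its values are assumptions made for the example and may not match the code in rlm.utils.token_utils.

```python
# Sketch only: signatures, heuristic, and limit values are assumptions,
# not the repo's actual implementation.

# Hypothetical limit table; the repo's real table and values may differ.
MODEL_CONTEXT_LIMITS = {"example-small-model": 8_000}
DEFAULT_CONTEXT_LIMIT = 128_000


class ContextWindowExceededError(ValueError):
    """Raised when a prompt is estimated to exceed the model's context window."""

    def __init__(self, model, prompt_tokens, limit):
        self.model = model
        self.prompt_tokens = prompt_tokens
        self.limit = limit
        super().__init__(
            f"Prompt for {model!r} is ~{prompt_tokens} tokens, "
            f"which exceeds the context window of {limit} tokens."
        )


def estimate_text_tokens(text):
    # Rough heuristic of about 4 characters per token (an assumption).
    return (len(text) + 3) // 4


def count_prompt_tokens(prompt):
    # Accept either a plain string or a list of chat-style message dicts.
    if isinstance(prompt, str):
        return estimate_text_tokens(prompt)
    return sum(estimate_text_tokens(str(m.get("content", ""))) for m in prompt)


def validate_prompt_fits_context_window(prompt, model):
    # Look up the limit, estimate the prompt size, and fail fast if too large.
    limit = MODEL_CONTEXT_LIMITS.get(model, DEFAULT_CONTEXT_LIMIT)
    tokens = count_prompt_tokens(prompt)
    if tokens > limit:
        raise ContextWindowExceededError(model, tokens, limit)
    return tokens
```

In this sketch a client would call `validate_prompt_fits_context_window(prompt, model)` immediately before the provider SDK call, so an oversized prompt raises one consistent error type instead of a provider-specific failure.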

Design Notes

This keeps the fix intentionally narrow and low-complexity.
There is no auto-chunking, auto-routing, or fallback summarization in this PR.
Existing batch semantics are preserved.
The validation is shared and provider-agnostic, while still using the repo's existing model limit table and token estimation helpers.

Tests

Added and updated tests for:

- token estimation and validation helpers
- client-side preflight validation before SDK calls
- LMHandler propagation of context-window failures
- LocalREPL / rlm_query(...) fallback error surfacing
- max-depth plain LM fallback behavior in _subcall
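
The "preflight validation before SDK calls" case can be checked along the following lines. This is a sketch, not the tests added in this PR: `FakeClient`, its `completion` method, and the `sdk.create` call are hypothetical stand-ins, and the token estimate is the same assumed 4-characters-per-token heuristic.

```python
from unittest.mock import MagicMock


class ContextWindowExceededError(ValueError):
    """Stand-in for the exception added in this PR."""


class FakeClient:
    """Hypothetical LM client used only for this illustration."""

    def __init__(self, sdk, max_tokens):
        self.sdk = sdk
        self.max_tokens = max_tokens

    def completion(self, prompt):
        # Preflight check happens before any SDK call is made.
        estimated = (len(prompt) + 3) // 4
        if estimated > self.max_tokens:
            raise ContextWindowExceededError(
                f"~{estimated} tokens exceeds limit of {self.max_tokens}"
            )
        return self.sdk.create(prompt)


def test_oversized_prompt_never_reaches_sdk():
    sdk = MagicMock()
    client = FakeClient(sdk, max_tokens=10)
    try:
        client.completion("x" * 1000)
    except ContextWindowExceededError:
        pass
    else:
        raise AssertionError("expected ContextWindowExceededError")
    # The mocked SDK must not have been called at all.
    sdk.create.assert_not_called()


def test_small_prompt_is_forwarded():
    sdk = MagicMock()
    sdk.create.return_value = "ok"
    client = FakeClient(sdk, max_tokens=10)
    assert client.completion("hi") == "ok"
    sdk.create.assert_called_once_with("hi")
```

Mocking the SDK object keeps the check offline and makes the ordering explicit: the guard either raises before the call or lets the prompt through unchanged.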

Verification

UV_CACHE_DIR=/tmp/uv-cache uv run pytest -q

Result:

292 passed, 7 skipped

taivu1998 · 2026-04-01T23:11:18Z

Hi @alexzhang13, could you help review it when you have time? Thanks!

taivu1998 · 2026-04-19T15:42:15Z

Hi @alexzhang13, could you help review it when you have time? Thanks!

Add context-window validation for plain LM calls

9c315aa

taivu1998 wants to merge 1 commit into alexzhang13:main from taivu1998:codex/issue-42-context-window-guard
