Summary
Extend autonomous-dev's context management to support non-Claude backends (GPT-4, Gemini, Ollama, vLLM, etc.) by adding configurable context window sizes, custom compaction strategies, and checkpoint/resume integration around context operations.
What Does NOT Work
Pattern: Hardcoded Claude-specific limits fail for other backends
Current system assumes Claude's 200K token context window
GPT-4 Turbo/GPT-4o have a 128K limit, Gemini 1.5 has a 1M limit, Llama 3.1 has 128K, Mixtral 8x7B has 32K
No mechanism to configure context window size per backend
Session state manager has no awareness of backend-specific constraints
Pattern: Auto-compaction assumption fails for non-Claude backends
Pattern: One-size-fits-all compaction fails for different backends
Claude handles compaction automatically and effectively
Other backends may need explicit summarization logic
No pluggable compaction strategy mechanism
Cannot customize compaction per backend requirements
Scenarios
Fresh Install
What happens: User installs autonomous-dev with default configuration
Uses Claude backend (default)
Context window = 200K tokens (Claude default)
Compaction managed automatically by Claude Code
No custom compaction logic needed
Expected behavior:
No configuration required for Claude users
Default .env values work out of the box
Backward compatible with existing workflows
Update/Upgrade - Valid Existing Data
What happens: User has existing .env with Claude configuration
Upgrade to multi-backend support version
Existing .env has no CONTEXT_WINDOW_SIZE variable
Expected behavior:
Auto-detect backend from .env (if BACKEND=claude or not set)
Set context window to Claude default (200K)
Preserve existing configuration
No user prompts needed
Update/Upgrade - Non-Claude Backend
What happens: User switches to GPT-4 backend
Sets BACKEND=openai in .env
No CONTEXT_WINDOW_SIZE specified
Expected behavior:
Auto-detect GPT-4 context limit (128K)
Enable custom compaction (CUSTOM_CONTEXT_COMPACTION=true)
Create checkpoint before first compaction
Use Anchored Iterative Summarization strategy
Log warning about reduced context capacity vs Claude
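The auto-detection and warning steps in this scenario could look roughly like the sketch below. The function name `detect_context_window`, the logger name, and the choice to raise for backends without a known default are assumptions made for illustration; the window sizes are the figures quoted in this issue and should be checked against the exact model in use.

```python
import logging

logger = logging.getLogger("context_window")

# Figures quoted in this issue; real limits depend on the exact model version.
BACKEND_DEFAULT_WINDOWS = {
    "claude": 200_000,
    "openai": 128_000,
    "gemini": 1_000_000,
}
CLAUDE_WINDOW = BACKEND_DEFAULT_WINDOWS["claude"]


def detect_context_window(backend: str) -> int:
    """Return the assumed default window for a backend name.

    An empty name is treated as Claude, matching the "BACKEND not set" case.
    """
    name = backend.strip().lower() or "claude"
    window = BACKEND_DEFAULT_WINDOWS.get(name)
    if window is None:
        # Self-hosted backends have no single sensible default (assumption).
        raise ValueError(f"no default window for {name!r}; set CONTEXT_WINDOW_SIZE")
    if window < CLAUDE_WINDOW:
        logger.warning(
            "backend %s offers %d tokens of context, below Claude's %d",
            name, window, CLAUDE_WINDOW,
        )
    return window
```

Raising for unknown backends forces self-hosted users to state CONTEXT_WINDOW_SIZE explicitly, which fits the Ollama scenario that follows.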
Update/Upgrade - Ollama Self-Hosted
What happens: User runs Ollama locally with Llama-3.1-70B
Sets BACKEND=ollama in .env
Custom CONTEXT_WINDOW_SIZE=8192 (deployment-specific limit; the window configured in Ollama can be far smaller than the model's 128K maximum)
Expected behavior:
Respect custom window size
Trigger compaction at 85% (configurable COMPACTION_THRESHOLD_PCT)
Use summarization strategy appropriate for smaller windows
Create checkpoint before compaction
Resume seamlessly after compaction
Update/Upgrade - User Customizations
What happens: User has custom .env settings
Custom CONTEXT_WINDOW_SIZE=100000
Custom COMPACTION_THRESHOLD_PCT=90
Custom COMPACTION_STRATEGY=truncate
Expected behavior:
Never overwrite user's explicit settings
Respect custom context window size
Use custom thresholds and strategies
Log info about active custom configuration
Implementation Approach
1. Configuration Layer (.env variables)
File: plugins/autonomous-dev/.env.example
Add backend-specific context window configuration:
# Context Window Configuration (Issue #278)
CONTEXT_WINDOW_SIZE=200000 # Max tokens (default: Claude's 200K)
COMPACTION_THRESHOLD_PCT=85 # Trigger compaction at 85% full (170K for 200K window)
BACKEND=claude # Backend: claude, openai, gemini, ollama, custom
# Custom Compaction Strategy (for non-Claude backends)
CUSTOM_CONTEXT_COMPACTION=false # true = implement our own compaction, false = rely on Claude
COMPACTION_STRATEGY=auto # auto (backend default) | summarize | truncate | clustering
CHECKPOINT_BEFORE_COMPACTION=true # Create checkpoint before compaction (safety)
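As a rough illustration of how these variables might be read and validated, the sketch below parses an already-loaded mapping of .env values into a frozen dataclass. The names `ContextConfig` and `parse_context_config`, the accepted boolean spellings, and the 1,000 to 2,000,000 token bounds are assumptions, not part of the existing codebase.

```python
from dataclasses import dataclass
from typing import Mapping, Set

VALID_BACKENDS = {"claude", "openai", "gemini", "ollama", "custom"}
VALID_STRATEGIES = {"auto", "summarize", "truncate", "clustering"}


@dataclass(frozen=True)
class ContextConfig:
    backend: str
    window_size: int
    threshold_pct: int
    custom_compaction: bool
    strategy: str
    checkpoint_before_compaction: bool


def _as_bool(raw: str) -> bool:
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def _choice(env: Mapping[str, str], key: str, default: str, allowed: Set[str]) -> str:
    value = env.get(key, default).strip().lower()
    if value not in allowed:
        raise ValueError(f"{key} must be one of {sorted(allowed)}, got {value!r}")
    return value


def _bounded_int(env: Mapping[str, str], key: str, default: int, low: int, high: int) -> int:
    raw = env.get(key, str(default)).strip()
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{key} must be an integer, got {raw!r}") from None
    if not low <= value <= high:
        raise ValueError(f"{key} must be between {low} and {high}, got {value}")
    return value


def parse_context_config(env: Mapping[str, str]) -> ContextConfig:
    """Build a validated config; missing keys fall back to the documented defaults."""
    return ContextConfig(
        backend=_choice(env, "BACKEND", "claude", VALID_BACKENDS),
        # Bounds are an assumed sanity range, not a documented limit.
        window_size=_bounded_int(env, "CONTEXT_WINDOW_SIZE", 200_000, 1_000, 2_000_000),
        threshold_pct=_bounded_int(env, "COMPACTION_THRESHOLD_PCT", 85, 1, 100),
        custom_compaction=_as_bool(env.get("CUSTOM_CONTEXT_COMPACTION", "false")),
        strategy=_choice(env, "COMPACTION_STRATEGY", "auto", VALID_STRATEGIES),
        checkpoint_before_compaction=_as_bool(env.get("CHECKPOINT_BEFORE_COMPACTION", "true")),
    )
```

An empty mapping yields the Claude defaults, so existing installations without the new variables keep working unchanged.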
2. Context Window Manager Library
File: plugins/autonomous-dev/lib/context_window_manager.py (NEW)
Create new library that:
Reads backend configuration from .env
Provides context window size for current backend
Calculates when to trigger compaction (threshold percentage)
Detects whether custom compaction is needed
Exposes API for session state manager integration
Key functions:
def get_context_window_size() -> int:
    """Get configured or auto-detected context window size."""

def get_compaction_threshold() -> int:
    """Calculate compaction trigger point (window_size * threshold_pct)."""

def should_trigger_compaction(current_tokens: int) -> bool:
    """Check if current token count exceeds threshold."""

def get_backend_name() -> str:
    """Get configured backend name."""

def needs_custom_compaction() -> bool:
    """Check if CUSTOM_CONTEXT_COMPACTION=true."""
3. Custom Compaction Strategies
File: plugins/autonomous-dev/lib/compaction_strategies.py (NEW)
Implement pluggable compaction strategies:
Strategy 1: AutoCompactionStrategy (Claude default)
Delegates to Claude Code's native compaction
No action needed - Claude handles it
Just monitor threshold for logging
Strategy 2: SummarizeCompactionStrategy (GPT-4, Ollama)
Uses Anchored Iterative Summarization (Factory.ai pattern)
Maintains structured summary with sections
Only summarizes newly-truncated spans (not entire history)
Preserves recent N messages (e.g., last 20)
Uses separate model call (Haiku for speed/cost)
Strategy 3: TruncateCompactionStrategy (Fallback)
Removes oldest messages (FIFO)
Keeps recent N messages
Simple but loses historical context
Use when summarization not available
Strategy 4: ClusteringCompactionStrategy (Advanced)
Groups related memories into clusters
Merges similar context
Reported 6-8x compression ratio (per the semantic compression research cited under Source of Truth)
Use for large-scale batch processing
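To make Strategy 2 more concrete, here is a minimal sketch of the anchoring idea as the bullets describe it: the existing summary is kept as the anchor, and only the messages that newly fall outside the recent window are summarized and merged into it. This is not Factory.ai's implementation. The state layout (`messages` and `summary` keys), the function name `anchored_compact`, and the `Summarizer` callable (which would wrap the separate model call) are assumptions.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
# (previous_summary, newly_evicted_messages) -> updated_summary
Summarizer = Callable[[str, List[Message]], str]


def anchored_compact(
    state: Dict[str, Any],
    summarize: Summarizer,
    keep_recent: int = 20,
) -> Dict[str, Any]:
    """Fold newly evicted messages into the running summary; keep the recent tail."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    messages: List[Message] = list(state.get("messages", []))
    cut = len(messages) - keep_recent
    if cut <= 0:
        return dict(state)  # nothing falls outside the recent window

    evicted, recent = messages[:cut], messages[cut:]
    new_state = dict(state)
    # Only the evicted span is sent to the summarizer, never the full history.
    new_state["summary"] = summarize(state.get("summary", ""), evicted)
    new_state["messages"] = recent
    return new_state
```

Because each call passes only the evicted span plus the previous summary, the cost of a compaction grows with the amount of new history rather than with the length of the whole session.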
Interface:
from abc import ABC, abstractmethod
from typing import Any, Dict

class CompactionStrategy(ABC):
    @abstractmethod
    def compact(self, session_state: Dict[str, Any], keep_recent: int = 20) -> Dict[str, Any]:
        """Compact session state, return new state with reduced tokens."""
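As an example of a concrete strategy behind this interface, the sketch below implements the FIFO truncation fallback (Strategy 3). It assumes the session state stores its history under a `messages` key and records how much was dropped under a hypothetical `dropped_count` key; neither name is confirmed by the existing code. The interface is repeated so the block runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class CompactionStrategy(ABC):
    @abstractmethod
    def compact(self, session_state: Dict[str, Any], keep_recent: int = 20) -> Dict[str, Any]:
        """Compact session state, return new state with reduced tokens."""


class TruncateCompactionStrategy(CompactionStrategy):
    """Drop the oldest messages and keep only the most recent ones."""

    def compact(self, session_state: Dict[str, Any], keep_recent: int = 20) -> Dict[str, Any]:
        if keep_recent < 0:
            raise ValueError("keep_recent must be non-negative")
        messages = list(session_state.get("messages", []))
        # Computing the cut index avoids the messages[-0:] pitfall when keep_recent is 0.
        cut = max(len(messages) - keep_recent, 0)
        new_state = dict(session_state)  # shallow copy; the input is left untouched
        new_state["messages"] = messages[cut:]
        new_state["dropped_count"] = session_state.get("dropped_count", 0) + cut
        return new_state
```

Returning a new dictionary instead of mutating the input keeps rollback simple: if anything fails later, the caller still holds the original state.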
4. Checkpoint Integration
File: Extend plugins/autonomous-dev/lib/session_state_manager.py
Add checkpoint/resume around compaction:
create_compaction_checkpoint(): Save state before compaction
restore_from_checkpoint(): Rollback if compaction fails
cleanup_checkpoint(): Remove checkpoint after success
Integration pattern (reuses Issue #276/#277 infrastructure):
# Before compaction
checkpoint_id = manager.create_compaction_checkpoint()
try:
    # compact() returns the new state, so keep the result
    state = compaction_strategy.compact(state)
    manager.cleanup_checkpoint(checkpoint_id)
except Exception:
    # Roll back to the pre-compaction state, then surface the error
    manager.restore_from_checkpoint(checkpoint_id)
    raise
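The pattern above can be exercised end to end with a small stand-in for the session state manager. In this sketch `CheckpointStore`, `compact_with_rollback`, and the `compaction-<id>.json` file naming are hypothetical; only the three method names come from this proposal. Plain `write_text` is used for brevity, whereas a real implementation would use the atomic-write pattern listed under Security Considerations.

```python
import json
import uuid
from pathlib import Path
from typing import Any, Callable, Dict

State = Dict[str, Any]


class CheckpointStore:
    """Hypothetical stand-in for the session state manager's checkpoint methods."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, checkpoint_id: str) -> Path:
        return self.directory / f"compaction-{checkpoint_id}.json"

    def create_compaction_checkpoint(self, state: State) -> str:
        checkpoint_id = uuid.uuid4().hex  # generated here, never user-supplied
        self._path(checkpoint_id).write_text(json.dumps(state), encoding="utf-8")
        return checkpoint_id

    def restore_from_checkpoint(self, checkpoint_id: str) -> State:
        return json.loads(self._path(checkpoint_id).read_text(encoding="utf-8"))

    def cleanup_checkpoint(self, checkpoint_id: str) -> None:
        self._path(checkpoint_id).unlink(missing_ok=True)


def compact_with_rollback(
    store: CheckpointStore,
    compact: Callable[[State], State],
    state: State,
) -> State:
    """Run compact(state); on failure restore state in place and re-raise."""
    checkpoint_id = store.create_compaction_checkpoint(state)
    try:
        new_state = compact(state)
    except Exception:
        # Undo any in-place damage, then let the caller see the failure.
        restored = store.restore_from_checkpoint(checkpoint_id)
        state.clear()
        state.update(restored)
        raise
    store.cleanup_checkpoint(checkpoint_id)
    return new_state
```

In this sketch the checkpoint file is deliberately left on disk after a failed compaction so it can be inspected or reused on the next attempt; whether the real manager should do the same is a design decision this issue leaves open.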
5. Batch State Manager Integration
File: Extend plugins/autonomous-dev/lib/batch_state_manager.py
Un-deprecate and enhance should_auto_clear():
Check CUSTOM_CONTEXT_COMPACTION environment variable
If false (Claude): Return False (let Claude handle it)
If true (custom backend): Use context_window_manager.should_trigger_compaction()
Create checkpoint before triggering compaction
Update from Issue #277:
from typing import Callable, Optional

def should_auto_clear(
    state: BatchState,
    checkpoint_callback: Optional[Callable[[], None]] = None,
) -> bool:
    """Check if context should be auto-cleared.

    Issue #277: DEPRECATED for Claude backends (auto-compact handles it).
    Issue #278: RE-ACTIVATED for custom backends (CUSTOM_CONTEXT_COMPACTION=true).
    """
    # Check if custom compaction is enabled
    if not context_window_manager.needs_custom_compaction():
        return False  # Claude handles it automatically

    # Custom backend - check threshold
    needs_clear = context_window_manager.should_trigger_compaction(
        state.context_token_estimate
    )

    # Invoke checkpoint callback before compaction
    if needs_clear and checkpoint_callback:
        checkpoint_callback()
    return needs_clear
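should_auto_clear() leans on helpers from the proposed context_window_manager module. A minimal sketch of how they might compute the trigger point follows; the optional `env` parameter (added so the logic can be tested without touching the process environment), the fallback defaults, and the use of `>=` for the trigger are assumptions, and backend auto-detection is left out.

```python
import os
from typing import Mapping, Optional

DEFAULT_WINDOW = 200_000       # Claude default quoted in this issue
DEFAULT_THRESHOLD_PCT = 85     # default COMPACTION_THRESHOLD_PCT


def _env(env: Optional[Mapping[str, str]]) -> Mapping[str, str]:
    # Hypothetical helper: callers may inject a mapping instead of os.environ.
    return os.environ if env is None else env


def get_context_window_size(env: Optional[Mapping[str, str]] = None) -> int:
    return int(_env(env).get("CONTEXT_WINDOW_SIZE", DEFAULT_WINDOW))


def get_compaction_threshold(env: Optional[Mapping[str, str]] = None) -> int:
    pct = int(_env(env).get("COMPACTION_THRESHOLD_PCT", DEFAULT_THRESHOLD_PCT))
    # Integer arithmetic avoids float rounding surprises.
    return get_context_window_size(env) * pct // 100


def should_trigger_compaction(
    current_tokens: int, env: Optional[Mapping[str, str]] = None
) -> bool:
    # ">=" is a choice made here; the issue only says "exceeds threshold".
    return current_tokens >= get_compaction_threshold(env)


def needs_custom_compaction(env: Optional[Mapping[str, str]] = None) -> bool:
    raw = _env(env).get("CUSTOM_CONTEXT_COMPACTION", "false")
    return raw.strip().lower() == "true"
```

With the Ollama scenario (CONTEXT_WINDOW_SIZE=8192, threshold 85%), get_compaction_threshold() evaluates to 6963 tokens; with the Claude defaults it is 170000.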
6. SessionStart Hook Enhancement
File: Extend plugins/autonomous-dev/hooks/SessionStart-batch-recovery.sh
Update hook to handle BOTH Claude auto-compact AND custom compaction:
Detect source="compact" (Claude auto-compact) OR
Detect compaction checkpoint file (custom compaction)
Load checkpoint and resume batch workflow
Display appropriate message based on compaction type
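The detection order described above could be expressed as a small decision function that the shell hook delegates to. This sketch assumes the hook input arrives as a JSON string containing a `source` field and that custom compaction leaves a checkpoint file at a known path; the function name, the returned labels, and the checkpoint location are all hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def classify_resume(hook_input: str, checkpoint_path: Path) -> Optional[str]:
    """Decide which kind of compaction, if any, preceded this session start."""
    try:
        payload = json.loads(hook_input) if hook_input.strip() else {}
    except json.JSONDecodeError:
        payload = {}  # malformed input is treated as "no information"

    # Claude auto-compact takes priority when the hook reports it.
    if isinstance(payload, dict) and payload.get("source") == "compact":
        return "claude-auto-compact"
    # Otherwise fall back to the marker left by custom compaction.
    if checkpoint_path.is_file():
        return "custom-compaction"
    return None
```

A `None` result means an ordinary session start, in which case the hook would skip the resume path entirely.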
Test Scenarios
Fresh Install (No Existing Data)
Test: Install with defaults (Claude)
Test: Install with GPT-4 backend
Test: Install with Ollama backend
Update with Valid Existing Data
Test: Upgrade from v3.45.0 (Claude only)
Test: Switch from Claude to GPT-4
Update with Invalid/Broken Data
Test: Invalid CONTEXT_WINDOW_SIZE in .env
Test: Missing .env file
Update with User Customizations
Test: Custom CONTEXT_WINDOW_SIZE=100000
Test: Custom compaction strategy
Rollback After Failure
Test: Compaction fails mid-operation
Test: Backend switch during batch processing
Compaction Strategies
Test: Summarize strategy with GPT-4
Test: Truncate strategy fallback
Test: Clustering strategy (advanced)
Acceptance Criteria
Fresh Install
Updates
Validation
Security
Compaction
Backend Integration
Backward Compatibility
Security Considerations
CWE-22 (Path Traversal): Validate batch_id and checkpoint paths (reuse Issue #276 validation)
CWE-59 (Symlink Following): File permissions validation (0o600 only)
CWE-117 (Log Injection): Sanitize backend names and configuration values before logging
CWE-502 (Insecure Deserialization): JSON-only serialization for checkpoints (no pickle)
Atomic writes: tempfile.mkstemp() → write → chmod(0o600) → replace() pattern
File locking: threading.RLock serializes concurrent access (threads within one process only; it does not lock across processes)
Backup recovery: Automatic fallback to .bak files on corruption
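The atomic-write and JSON-only items above can be combined in one helper. The sketch below follows the listed mkstemp, write, chmod, replace sequence; the function name `atomic_write_json` and the added fsync call are assumptions rather than a description of the existing session_state_manager.py code.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def atomic_write_json(path: Path, payload: Dict[str, Any]) -> None:
    """Write payload as JSON so readers see either the old or the new file."""
    data = json.dumps(payload)  # serialize first: fail before touching the disk
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=f".{path.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # assumption: durability matters for checkpoints
        os.chmod(tmp_name, 0o600)      # owner read/write only
        os.replace(tmp_name, path)     # atomic rename over the destination
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)        # never leave a stray temp file behind
        raise
```

Serializing with json.dumps() before creating the temporary file means an unserializable payload can never clobber an existing checkpoint.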
Environment Requirements
Python: 3.11+ (existing requirement)
Claude Code: 2.0+ with plugins (existing requirement)
Operating System: macOS, Linux (existing requirement)
Dependencies: No new dependencies (reuses existing libraries)
Optional: Haiku model access for summarization (cost-effective compaction)
Breaking Changes
should_auto_clear() Un-Deprecation
Issue #277: Deprecated should_auto_clear() assuming Claude handles compaction
Issue #278: Re-activates should_auto_clear() for custom backends
Migration:
Configuration Priority
New three-tier priority system (highest priority first):
1. Environment variables (.env)
2. PROJECT.md configuration
3. Backend-specific defaults
Example:
# .env takes precedence
CONTEXT_WINDOW_SIZE=150000 # Overrides backend default
# If not in .env, check PROJECT.md
# If not in PROJECT.md, use backend default (200K for Claude, 128K for GPT-4)
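The lookup order can be sketched as a single resolver. How PROJECT.md configuration is actually parsed is not specified in this issue, so the sketch takes both sources as already-parsed mappings; the function name `resolve_window_size`, the returned source labels, and the treatment of blank values as unset are assumptions.

```python
from typing import Mapping, Tuple

# Defaults quoted in this issue for the two backends used in the example.
BACKEND_DEFAULTS = {"claude": 200_000, "openai": 128_000}


def resolve_window_size(
    env: Mapping[str, str],
    project_config: Mapping[str, str],
    backend: str,
) -> Tuple[int, str]:
    """Return (window_size, source) using .env, then PROJECT.md, then defaults."""
    for source, mapping in ((".env", env), ("PROJECT.md", project_config)):
        raw = mapping.get("CONTEXT_WINDOW_SIZE")
        # Blank values are treated as "not set" and fall through to the next tier.
        if raw is not None and str(raw).strip():
            return int(raw), source
    return BACKEND_DEFAULTS[backend], "backend default"
```

Returning the source alongside the value makes it easy to log which tier supplied the active setting, in line with the "log info about active custom configuration" behavior described in the scenarios.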
Source of Truth
Verified: Existing codebase patterns
Web Research Verified:
Anchored Iterative Summarization (Factory.ai, 2024)
Semantic Compression research (6-8x compression, arXiv 2023)
Claude API context window specs (200K standard, 1M beta)
GPT-4 specs (128K context window)
Date Verified: 2026-01-28
Research Cached: .claude/cache/research_278.json (for /implement reuse)
Related Issues:
Estimated Implementation Time: 4-6 hours (one focused development session)
Priority: HIGH - Enables Claude Code usage with custom backends (Ollama, vLLM, self-hosted models)
Verified codebase patterns (detail for the Source of Truth section above):
session_state_manager.py: Checkpoint/atomic write patterns (Issue #247)
batch_state_manager.py: Resume/crash recovery patterns (Issue #76)
ralph_loop_manager.py: Checkpoint infrastructure (Issue #276/#277)
pool_config.py: Environment-based configuration patterns
batch_resume_helper.py: SessionStart hook integration (Issue #277)