An open-source context management platform for LLM-powered agents
LLM agents experience context degradation well before hitting token limits. Quality drops significantly after ~50 tool calls, typically at around 60-70% of context capacity, a phenomenon we call "pre-rot".
```
Quality
 100% ─────────────────────╮
      │                    ╰────╮
  80% │                         ╰────╮
      │                              ╰───╮
  60% │                                  ╰─────
      └──────┬───────┬───────┬─────┬─────┬─────
            0%      25%     50%   65%   80%  100%
                        Token Usage

      ◄────── Safe ──────►◄ Pre-Rot ►◄ Degraded ►
```
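The zone boundaries sketched above can be expressed as a simple utilization check. This is a minimal illustration, not the library's API; the 65% and 80% cutoffs are assumptions read off the chart:

```python
# Illustrative sketch (not the ContextEngine API): classify context health
# by token utilization, using the ~65% / ~80% thresholds shown above.
def context_zone(used_tokens: int, total_tokens: int,
                 pre_rot_at: float = 0.65, degraded_at: float = 0.80) -> str:
    """Return 'safe', 'pre-rot', or 'degraded' for a context window."""
    utilization = used_tokens / total_tokens
    if utilization < pre_rot_at:
        return "safe"
    if utilization < degraded_at:
        return "pre-rot"
    return "degraded"

print(context_zone(50_000, 100_000))  # safe
print(context_zone(70_000, 100_000))  # pre-rot
```

The point of acting at the "pre-rot" boundary rather than at the hard limit is that compression can still preserve quality there; past 80%, degradation has already set in.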
ContextEngine treats context as a first-class data structure (graph-based, not flat strings) and applies intelligent compression through a tiered strategy:
- Lossless (100% recoverable): Externalize payloads, deduplicate, collapse tool chains
- Compaction (80-95% recoverable): Schema compression, entity-centric filtering
- Summarization (last resort): Hierarchical, task-aware, and incremental summarization
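The escalation logic behind the tiers can be sketched as a loop that tries the cheapest, most recoverable tier first and only moves on if the budget is still exceeded. This is a toy illustration with made-up helper functions, not the ContextEngine pipeline:

```python
# Toy sketch of tiered compression (illustrative names, not the package API):
# apply lossless steps first, escalate only while over budget.
def deduplicate(chunks):            # lossless tier: drop exact duplicates
    seen, out = set(), []
    for c in chunks:
        if c not in seen:
            seen.add(c)
            out.append(c)
    return out

def compact(chunks, max_len=40):    # compaction tier: trim oversized chunks
    return [c[:max_len] for c in chunks]

def summarize(chunks):              # last resort: collapse to a stub summary
    return [f"[summary of {len(chunks)} chunks]"]

def compress(chunks, budget):
    for tier in (deduplicate, compact, summarize):
        if sum(len(c) for c in chunks) <= budget:
            break                   # stop escalating once within budget
        chunks = tier(chunks)
    return chunks
```

Because each tier only runs when the previous one was insufficient, a context that fits after deduplication is never summarized, which is what makes the bulk of compression recoverable.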
- Graph-based Context: Typed nodes (messages, tool calls, artifacts) with relationships
- Smart Compression: 10-20x compression with minimal information loss
- Pre-rot Detection: Act before quality degrades, not after
- Entity Tracking: NER-powered entity extraction and importance scoring
- Semantic Search: Embedding-based similarity and duplicate detection
- Recovery Manifests: Track all compression operations for potential rollback
- Tiered Storage: Hot/warm/cold storage with automatic migration policies
- Tool Caching: Semantic and exact-match caching with pattern detection
- Predictive Prefetch: Learn tool patterns and prefetch likely next calls
- Observable: OpenTelemetry tracing and Prometheus metrics built-in
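To make the tool-caching feature concrete, here is a minimal sketch of exact-match caching, where identical (tool, arguments) pairs reuse a stored result instead of re-running the call. The class and method names are illustrative, not the library's `ToolCallCache` API:

```python
# Minimal exact-match tool-call cache (illustrative, not the library API).
import json

class ToolCache:
    def __init__(self):
        self._store, self.hits = {}, 0

    def call(self, tool, args, fn):
        # Canonical key: tool name plus sorted JSON of its arguments.
        key = (tool, json.dumps(args, sort_keys=True))
        if key in self._store:
            self.hits += 1          # repeated call: reuse cached result
        else:
            self._store[key] = fn(**args)
        return self._store[key]

cache = ToolCache()
cache.call("glob", {"pattern": "*.py"}, lambda pattern: [pattern])
cache.call("glob", {"pattern": "*.py"}, lambda pattern: [pattern])
print(cache.hits)  # 1
```

Semantic caching (also listed above) extends this idea by matching calls whose arguments are similar rather than identical.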
```bash
# Using uv (recommended)
uv add context-core context-compression context-memory context-tools

# Or with pip
pip install context-core context-compression context-memory context-tools
```

```python
from context_core import ContextGraph, TokenBudget, EntityTracker, SemanticIndex
from context_compression import CompressionPipeline, CompressionTier

# Create a context graph
graph = ContextGraph(session_id="my-session")

# Add messages and tool calls
graph.add_message(role="user", content="Find all Python files in the src directory")
call = graph.add_tool_call("glob", {"pattern": "src/**/*.py"})
graph.add_tool_result(call.id, ["src/main.py", "src/utils.py", "src/config.py"])

# Track token budget with pre-rot detection
budget = TokenBudget(total_tokens=100_000)
budget.allocate("context", graph.total_tokens)

if budget.status.needs_compression:
    # Apply intelligent compression
    pipeline = CompressionPipeline()
    results = pipeline.compress(graph, max_tier=CompressionTier.COMPACTION)
    print(f"Saved {sum(r.tokens_saved for r in results)} tokens")
```

```
┌───────────────────────────────────────────────────────────────────┐
│                           ContextEngine                           │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐             │
│   │context-core │   │  context-   │   │  context-   │             │
│   │             │   │ compression │   │   observe   │             │
│   │ • Graph     │   │             │   │             │             │
│   │ • Entities  │   │ • Pipeline  │   │ • Tracing   │             │
│   │ • Semantic  │   │ • Lossless  │   │ • Metrics   │             │
│   │ • Budget    │   │ • Compaction│   │ • Events    │             │
│   │ • Tokenizer │   │ • Summary   │   │             │             │
│   └─────────────┘   └─────────────┘   └─────────────┘             │
│                                                                   │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐             │
│   │  context-   │   │  context-   │   │  context-   │             │
│   │   memory    │   │    tools    │   │ multiagent  │ (Phase 4)   │
│   │             │   │             │   │             │             │
│   │ • Tiered    │   │ • Caching   │   │ • Broker    │             │
│   │ • Working   │   │ • Patterns  │   │ • Handoff   │             │
│   │ • Retrieval │   │ • Prefetch  │   │ • Sync      │             │
│   └─────────────┘   └─────────────┘   └─────────────┘             │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
```
| Package | Description | Status | Tests |
|---|---|---|---|
| `context-core` | Graph, entities, semantic index, token budget | ✅ Complete | 358 |
| `context-compression` | Compression pipeline with 9 strategies | ✅ Complete | 311 |
| `context-observe` | OpenTelemetry tracing, Prometheus metrics | ✅ Complete | - |
| `context-memory` | Storage backends, tiered storage, retrieval | ✅ Complete | 307 |
| `context-tools` | Tool caching, patterns, prefetching | ✅ Complete | 283 |
| `context-multiagent` | Broker, handoff, shared memory | 🚧 Phase 4 | - |
Lossless strategies (100% recoverable):

| Strategy | Description | Compression |
|---|---|---|
| `ExternalizePayloads` | Move large outputs to external storage | 2-5x |
| `DeduplicateSemantically` | Remove near-duplicate content | 1.5-2x |
| `CollapseToolChains` | Merge sequential related tool calls | 2-3x |
Compaction strategies (80-95% recoverable):

| Strategy | Description | Compression |
|---|---|---|
| `SchemaCompression` | Extract and reference repeated JSON schemas | 2-4x |
| `EntityCentricCompression` | Keep only entity-relevant sentences | 2-3x |
| `TaskRelevanceCompression` | Filter by current task relevance | 2-4x |
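The schema-compression idea can be shown with a toy example: when many JSON objects repeat the same keys, the key list can be stored once and only value rows kept. This is a hand-rolled illustration, not the `SchemaCompression` implementation:

```python
# Illustrative sketch of schema compression: factor the repeated key
# structure out of homogeneous JSON records, keeping only value rows.
def compress_records(records):
    schema = sorted(records[0].keys())           # shared key structure
    rows = [[r[k] for k in schema] for r in records]
    return {"schema": schema, "rows": rows}

records = [
    {"path": "src/main.py", "lines": 120},
    {"path": "src/utils.py", "lines": 45},
]
packed = compress_records(records)
print(packed["schema"])  # ['lines', 'path']
```

The transformation is reversible (zip each row back with the schema), which is why this tier stays in the 80-95% recoverable band.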
Summarization strategies (last resort):

| Strategy | Description | Compression |
|---|---|---|
| `HierarchicalSummarization` | Bottom-up multi-level summaries | 5-10x |
| `TaskAwareSummarization` | Task-focused with relevance scoring | 5-10x |
| `IncrementalSummarization` | Streaming updates to running summary | 5-10x |
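The bottom-up shape of hierarchical summarization can be sketched as repeatedly pairing chunks and replacing each pair with a summary until one remains. In the real strategy each `summarize_pair` call would be an LLM summary; here it is just a placeholder label:

```python
# Toy sketch of hierarchical (bottom-up) summarization: merge chunks
# pairwise, level by level, until a single root summary remains.
def summarize_pair(a: str, b: str) -> str:
    return f"summary({a} + {b})"     # stand-in for an LLM summary call

def hierarchical_summary(chunks):
    level = list(chunks)
    while len(level) > 1:
        level = [summarize_pair(level[i], level[i + 1])
                 if i + 1 < len(level) else level[i]   # odd chunk carries over
                 for i in range(0, len(level), 2)]
    return level[0]

print(hierarchical_summary(["m1", "m2", "m3", "m4"]))
```

Keeping the intermediate levels lets later queries drill down from the root summary to progressively less-compressed detail.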
```bash
# Clone the repository
git clone https://github.com/Sean-Koval/context-engineering.git
cd context-engineering

# Install dependencies
uv sync

# Install all packages in development mode
uv pip install -e packages/context-core -e packages/context-compression -e packages/context-memory -e packages/context-tools -e packages/context-observe

# Run tests
uv run pytest

# Format and lint
uv run ruff format .
uv run ruff check --fix .

# Type check
uv run ty check .
```

```
context-engineering/
├── packages/
│   ├── context-core/              # Foundation package (358 tests)
│   │   ├── src/context_core/
│   │   │   ├── graph/             # ContextGraph, nodes, edges
│   │   │   ├── entities/          # EntityTracker, NER backends
│   │   │   ├── semantic/          # SemanticIndex, vector stores
│   │   │   ├── budget/            # TokenBudget, pre-rot detection
│   │   │   └── tokenizer/         # Tokenizer protocol, implementations
│   │   └── tests/
│   ├── context-compression/       # Compression pipeline (311 tests)
│   │   ├── src/context_compression/
│   │   │   ├── strategies/
│   │   │   │   ├── lossless/      # Externalize, deduplicate, collapse
│   │   │   │   ├── compaction/    # Schema, entity-centric, task
│   │   │   │   └── summarization/ # Hierarchical, task-aware, incremental
│   │   │   ├── recovery/          # Manifest, operations
│   │   │   └── pipeline.py        # CompressionPipeline orchestrator
│   │   └── tests/
│   ├── context-memory/            # Persistent storage (307 tests)
│   │   ├── src/context_memory/
│   │   │   ├── backends/          # FileSystem, SQLite, Postgres, Redis
│   │   │   ├── retrieval/         # Semantic, Entity, Temporal, Ensemble
│   │   │   ├── artifacts/         # Versioned artifact management
│   │   │   ├── tiered.py          # Hot/warm/cold tiered storage
│   │   │   ├── working.py         # Working memory with LRU cache
│   │   │   └── eviction.py        # Multi-tier eviction strategies
│   │   └── tests/
│   ├── context-tools/             # Tool optimization (283 tests)
│   │   ├── src/context_tools/
│   │   │   ├── cache/             # ToolCallCache, semantic matching
│   │   │   ├── patterns/          # ToolUsagePatterns, antipattern detection
│   │   │   ├── compression/       # ToolResultCompressor, schema extraction
│   │   │   └── prefetch/          # ToolPrefetcher, argument prediction
│   │   └── tests/
│   └── context-observe/           # Observability
│       ├── src/context_observe/
│       │   ├── tracer.py          # OpenTelemetry integration
│       │   ├── metrics.py         # Prometheus metrics
│       │   └── events.py          # Structured logging
│       └── tests/
├── specs/                         # Technical specifications
├── docs/                          # Research and analysis
├── INDEX.md                       # Implementation progress tracking
├── TASK_BOARD.md                  # Granular task breakdown
└── MASTER_ROADMAP.md              # Vision and architecture
```
| Phase | Focus | Status |
|---|---|---|
| Phase 1 | Foundation (Graph, Entities, Semantic, Budget) | ✅ Complete |
| Phase 2 | Compression (Pipeline, 9 Strategies, Recovery) | ✅ Complete |
| Phase 3 | Memory & Tools (Storage, Caching, Patterns) | ✅ Complete |
| Phase 4 | Multi-Agent (Broker, Handoff, Sync) | 🚧 Planned |
| Metric | Target | Current |
|---|---|---|
| Context utilization before degradation | 90%+ | ✅ |
| Reversible compression ratio | 3-5x | ✅ |
| Total compression ratio | 10-20x | ✅ |
| Test coverage | 90%+ | 1,259 tests |
| Memory retrieval p99 latency | < 100ms | ✅ |
| Tool cache hit rate | > 60% | ✅ |
We welcome contributions! See CONTRIBUTING.md for guidelines.
Each spec file in specs/ contains:
- Complete Python code with type hints
- Pydantic models for all data structures
- Implementation checklists
- Test specifications
Use TASK_BOARD.md for granular task breakdown with dependencies.
MIT License - see LICENSE for details.
- Built with uv for blazing fast package management
- Uses NetworkX for graph operations
- Embeddings powered by sentence-transformers
- NER via spaCy
ContextEngine - Because context is too valuable to waste.