Experimental agent IDE and runtime for coding tasks, built with LangGraph, MySQL, Redis, a web workbench, and benchmark tooling for SWE-bench Lite.
This repository is not a finished product. It is an active engineering sandbox focused on:
- a persisted agent runtime with LangGraph orchestration
- a browser-based IDE shell for sessions, files, and terminal actions
- MiniMax and Claude provider adapters
- a headless SWE-bench Lite runner and official harness workflow
The current state is best described as: implemented in parts, validated in parts, still incomplete as a full agent IDE.
Useful project status documents:
Repository layout:
- apps/ide-web: Minimal web IDE shell, provider hooks, browser endpoints, and dev server.
- packages/core: Core domain entities such as sessions, goals, plans, tasks, memory, and tool policy.
- packages/runtime: LangGraph workflow contracts, runtime orchestration, and application services.
- packages/db: MySQL repositories, Redis-backed caches, bootstrap, and checkpoint persistence.
- packages/tools: Built-in file and shell-oriented tools used by the runtime.
- packages/evals: Smoke scripts, benchmark runners, and SWE-bench Lite export helpers.
- docs: Architecture, reading order, deep-dive notes, benchmark instructions, and interview notes.
- Node.js
- pnpm 10.x
- Docker Desktop
- Python 3 for the SWE-bench export script
pnpm install
cp .env.example .env
At minimum, configure:
- MySQL / Redis connection values
- one LLM provider
  - MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_MODEL
  - or the Claude env vars if you are using the Anthropic-compatible path
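Before starting the infrastructure, it can help to confirm that the provider variables are actually set. The following is a minimal sketch, not part of the repository: `parse_env_file` and `missing_vars` are hypothetical helpers, the parser handles only plain KEY=VALUE lines, and only the three MiniMax variable names listed above are checked because the MySQL, Redis, and Claude variable names are not given here.

```python
import os

# Names taken from the configuration list above; other variables are not listed there.
MINIMAX_VARS = ("MINIMAX_API_KEY", "MINIMAX_BASE_URL", "MINIMAX_MODEL")


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any, so quoted empty strings count as empty.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values: dict, required=MINIMAX_VARS) -> list:
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    path = ".env"  # created by copying .env.example
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            missing = missing_vars(parse_env_file(fh.read()))
        print("missing:", ", ".join(missing) if missing else "none")
    else:
        print(".env not found; copy .env.example first")
```

If you use the Claude path instead, pass that provider's variable names as the `required` argument.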
pnpm infra:up
The repository ships with:
- MySQL 8.4
- Redis 7
Compose file:
pnpm dev:ide-web
The dev server listens on 127.0.0.1:3440 by default.
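To confirm the dev server is up, a plain TCP connection check against the default address is enough. This is a small sketch using only the Python standard library; `is_listening` is a hypothetical helper name, and the check only proves that something accepts connections on the port, not that it is the IDE.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 3440, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable"
    print(f"127.0.0.1:3440 is {state}")
```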
Persistence smoke:
pnpm smoke:persistence
MiniMax smoke:
pnpm smoke:minimax
Type check:
pnpm typecheck
This repository includes a headless runner that produces predictions.json for the official SWE-bench harness.
Primary files:
- packages/evals/src/swebench-lite.ts
- packages/evals/scripts/export_swebench_lite_subset.py
- docs/swebench-lite.md
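Since predictions.json is handed to the official harness, a quick sanity check before evaluation can save a failed run. The sketch below assumes the file is a JSON array of records using the harness's standard keys (`instance_id`, `model_name_or_path`, `model_patch`); whether this repository's runner writes exactly that shape is an assumption, and `check_predictions` is a hypothetical helper, not part of packages/evals.

```python
import json
import sys

# Keys the official SWE-bench harness reads from each prediction record.
REQUIRED_KEYS = ("instance_id", "model_name_or_path", "model_patch")


def check_predictions(predictions):
    """Return (instance_ids, problems) for a list of prediction records."""
    ids, problems = [], []
    for index, record in enumerate(predictions):
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
            continue
        if not record["model_patch"]:
            # An empty patch cannot resolve an instance, so flag it but keep the id.
            problems.append(f"{record['instance_id']}: empty patch")
        ids.append(record["instance_id"])
    return ids, problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as fh:
        ids, problems = check_predictions(json.load(fh))
    for problem in problems:
        print("warning:", problem)
    # Space-separated ids, suitable for the harness --instance_ids argument.
    print(" ".join(ids))
```

Run it with the path to a predictions.json file as the only argument; the last line of output lists the instance ids found in the file.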
Export a small instance set:
pnpm bench:swebench:export --count 5 --output .benchmarks/swebench-lite/instances.json
Run the local Lite runner:
LLM_PROVIDER=minimax pnpm bench:swebench:lite -- \
--instances-file .benchmarks/swebench-lite/instances.json \
--run-id swebench-lite-manual-1
Then evaluate the generated predictions in the official SWE-bench repository:
python -m swebench.harness.run_evaluation \
--dataset_name princeton-nlp/SWE-bench_Lite \
--predictions_path /path/to/opencode/.benchmarks/swebench-lite/runs/swebench-lite-manual-1/predictions.json \
--instance_ids <instance ids...> \
--max_workers 1 \
--run_id swebench-lite-manual-1 \
--namespace ''
Start here if you want to understand the codebase instead of only running it: