mini-swe-agent

A minimal fork of SWE-agent/mini-swe-agent, customized for local evaluation with vLLM.


🚀 Quick Setup

1. Install Miniforge

Miniforge is a lightweight Conda installer.

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

See the official Miniforge repo for more details.
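After the installer finishes, restart your shell and sanity-check the installation (a generic Conda command, not specific to this repo):

conda --version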


2. Clone Repository

git clone https://github.com/stovecat/mini-swe-agent.git
cd mini-swe-agent

3. Create Conda Environment

# Modify `prefix` to reflect your working directory (see the example below).
vi environment.yml
conda env create -f environment.yml
conda activate swe
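The `prefix` entry in environment.yml tells Conda where to create the environment. An illustrative value (the path is an example, not a required location):

prefix: /mnt/sda/yourname/miniforge3/envs/swe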

4. SWE-bench CLI Setup

Install and configure the SWE-bench CLI:

pip install sb-cli
sb-cli gen-api-key your.email@example.com

Then export your API key and verify it using the verification code sent to your email:

export SWEBENCH_API_KEY=your_api_key
sb-cli verify-api-key YOUR_VERIFICATION_CODE

See SWE-bench CLI docs for reference.


5. Configure Environment Variables

Edit your ~/.bashrc (or ~/.zshrc) and add:

export HF_HOME=[YOUR_CACHE_PATH]/.cache/huggingface
export SWEBENCH_API_KEY=[YOUR_SWE_API_KEY]
Then reload your shell configuration and reactivate the environment:

source ~/.bashrc
conda activate swe

6. Launch vLLM

Edit the model initialization script and adjust its variables as needed:

vi scripts/init_run_vllm_model.sh

Variables to review and modify in the script:

  • CUDA device
  • max-model-len
  • tensor-parallel-size
  • port number
  • MODEL_NAME (the name your vLLM service is served under)
  • HF_MODEL_NAME (the full Hugging Face model name, including the organization prefix)

Mount your working directory so that HF_HOME is visible inside the Docker container. For example, if HF_HOME=/mnt/sda/hojae/.cache/huggingface on the host, the launch command should include ... -v /mnt/sda/hojae:/workspace \ -e HF_HOME=/workspace/.cache/huggingface ...

If you are using OpenAI models such as gpt-oss-120b, ensure that the tiktoken_cache directory is stored under .cache.
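As a rough sketch only (not the exact contents of scripts/init_run_vllm_model.sh; the image tag, paths, GPU indices, port, and model names below are illustrative), a Docker-based vLLM launch of this shape looks like:

# Illustrative sketch; the real flags live in scripts/init_run_vllm_model.sh.
docker run --gpus '"device=0,1"' \
  -v /mnt/sda/hojae:/workspace \
  -e HF_HOME=/workspace/.cache/huggingface \
  -e HF_HUB_OFFLINE=1 \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model openai/gpt-oss-120b \
  --served-model-name gpt-oss-120b \
  --tensor-parallel-size 2 \
  --max-model-len 131072 \
  --port 8000

Once the variables look right, launch the script: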

bash scripts/init_run_vllm_model.sh

When the model loads successfully, stop it and create your run script (e.g. based on scripts/run_vllm_gpt-oss-120b.sh).

⚠️ Important: Use HF_HUB_OFFLINE=1 (local mode) to ensure greedy decoding and temperature settings apply correctly.

Your MODEL_NAME must have a corresponding entry in src/minisweagent/config/model_prices_and_context_window.json.
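Entries in that registry follow the standard LiteLLM schema. A minimal sketch with illustrative token limits and zero costs (the field values and provider are assumptions; adjust them for your model):

{
  "gpt-oss-120b": {
    "max_tokens": 131072,
    "max_input_tokens": 131072,
    "max_output_tokens": 131072,
    "input_cost_per_token": 0.0,
    "output_cost_per_token": 0.0,
    "litellm_provider": "openai",
    "mode": "chat"
  }
}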

cp scripts/run_vllm_gpt-oss-120b.sh scripts/run_vllm_[MODEL_NAME].sh
vi scripts/run_vllm_[MODEL_NAME].sh
bash scripts/run_vllm_[MODEL_NAME].sh

7. Prepare Model YAML

Use src/minisweagent/config/extra/vllm_gpt-oss-120b_swebench.yaml as a template.

Set:

model_name: [MODEL_NAME]
litellm_model_registry: [ABSOLUTE_PATH_TO `model_prices_and_context_window.json`]
port: [SAME_PORT_NUMBER_AS_IN_vLLM_LAUNCH_SCRIPT]
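Filled in with illustrative values (the model name, path, and port below are examples only), this looks like:

model_name: gpt-oss-120b
litellm_model_registry: /home/yourname/mini-swe-agent/src/minisweagent/config/model_prices_and_context_window.json
port: 8000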

8. Run Evaluation

Edit and execute the evaluation script:

# Modify BASE_DIR and MODEL_NAME
vi scripts/eval_vllm_swebench.sh
bash scripts/eval_vllm_swebench.sh
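The edit amounts to setting two shell variables, for example (paths and names are illustrative; how they are used is defined in the script itself):

# Illustrative values; adjust to your setup.
BASE_DIR=/mnt/sda/yourname/mini-swe-agent
MODEL_NAME=gpt-oss-120b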

🧩 Notes

  • MODEL_NAME must match across:

    • vLLM launch script
    • YAML configuration
    • Evaluation script
    • model_prices_and_context_window.json

🧪 Results

Model                         % Resolved   Version
Reported
gpt-oss-120b                  26.00        1.7.0
Llama-4-Scout-17B-16E         9.06         0.0.0
Qwen2.5-Coder-32B-Instruct    9.00         1.0.0
Reproduced
gpt-oss-120b                  30.00        1.14.2
Qwen3-30B-A3B-Instruct-2507   11.00        1.14.2

📚 References
