AISBench
Popular repositories
- mini-swe-agent Public Forked from SWE-agent/mini-swe-agent
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
Python
- terminal-bench-2 Public Forked from harbor-framework/terminal-bench-2
Presets all environments in Docker images
Shell
Repositories
- terminal-bench-2 Public Forked from harbor-framework/terminal-bench-2
Presets all environments in Docker images
- benchmark Public
AISBench Benchmark is a model evaluation tool built on OpenCompass, compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, while extending support for service-based models.
- mini-swe-agent Public Forked from SWE-agent/mini-swe-agent
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
People
This organization has no public members.