Open-source adversarial robustness benchmarking for every AI model.
Submit your model. Survive FGSM, PGD, prompt injection, jailbreaks, and 15+ more industry-standard attacks. Climb the public leaderboard. Contribute new attacks via GitHub PR.
Model Arena is a community-driven benchmarking platform where:
- Developers submit AI models (image classifiers, LLMs, object detectors, tabular/audio models) to be evaluated against adversarial attacks
- Researchers contribute novel attacks via GitHub PRs — once merged, they run against every model on the platform
- Everyone can see the public leaderboard and full evaluation results
The platform includes an AutoResearch agent that continuously scans arXiv, uses Claude to synthesize novel attacks from recent papers, and auto-submits promising ones for maintainer review.
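The scanning step of such an agent can be sketched with the public arXiv API. This is a minimal illustration, not the AutoResearch agent's actual code: the function names `build_query_url` and `parse_feed` and the choice of search terms are assumptions. The sketch stops before the Claude synthesis and submission steps.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace used by arXiv's Atom feed


def build_query_url(terms, max_results=10):
    """Build an arXiv API URL for the newest papers matching all terms (hypothetical helper)."""
    search = " AND ".join(f'all:"{term}"' for term in terms)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": str(max_results),
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_feed(atom_xml):
    """Extract id, title, and summary from each entry of an Atom response (hypothetical helper)."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        papers.append({
            "id": entry.findtext(f"{ATOM}id", default="").strip(),
            # Collapse the line breaks and repeated spaces that feeds often contain.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
        })
    return papers
```

To fetch live results, the URL can be passed to `urllib.request.urlopen(...)` and the response body to `parse_feed`. The parsing is kept separate from the network call so it can be tested offline.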
| Category | Weight | Attacks |
|---|---|---|
| Gradient-Based | 30% | FGSM, PGD, AutoAttack (APGD-CE+DLR+FAB+Square), C&W L2, DeepFool |
| Black-Box | 25% | Square Attack, HopSkipJump, Adversarial Patch |
| Prompt Injection | 25% | Direct injection, GCG, Cipher Attacks (8 variants) |
| Jailbreak | 20% | Jailbreak Templates, PAIR, TAP, AutoDAN, Many-Shot |
| Multimodal | — | Typographic Attack |
| Tabular | — | Tabular PGD |
| Audio | — | Universal Audio Perturbation |
Community attacks are added via Pull Request (see Contributing).
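One way the category weights could combine into a single leaderboard number is a weighted mean. The sketch below is an assumption about the aggregation, not the platform's documented formula: it assumes each category yields a robustness score in [0, 1], and that categories a model was not evaluated on are skipped with the remaining weights renormalized. The unweighted categories (Multimodal, Tabular, Audio) are left out.

```python
# Weights copied from the table above, expressed as fractions.
CATEGORY_WEIGHTS = {
    "gradient_based": 0.30,
    "black_box": 0.25,
    "prompt_injection": 0.25,
    "jailbreak": 0.20,
}


def overall_score(category_scores):
    """Weighted mean of per-category robustness scores in [0, 1].

    Assumption: categories missing from category_scores are skipped and the
    remaining weights are renormalized so they still sum to 1.
    """
    used = {cat: w for cat, w in CATEGORY_WEIGHTS.items() if cat in category_scores}
    if not used:
        raise ValueError("no weighted categories were evaluated")
    total_weight = sum(used.values())
    return sum(category_scores[cat] * w for cat, w in used.items()) / total_weight
```

Under these assumptions, a model scored only on Gradient-Based (0.5) and Jailbreak (1.0) gets (0.5 × 0.30 + 1.0 × 0.20) / 0.50 = 0.7.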
| Layer | Technology |
|---|---|
| Backend | FastAPI · SQLAlchemy (async) · PostgreSQL · Celery · Redis |
| Frontend | Next.js 14 · TailwindCSS · TanStack Query · Recharts · WebSockets |
| AI | Anthropic Claude (PAIR attacker, TAP evaluator, AutoResearch agent) |
| Infra | Docker Compose (Postgres + Redis) |
- Python 3.11+
- Node.js 18+
- PostgreSQL 14+
- Redis 7+
cd backend
# Create virtual environment
python -m venv .venv && source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install "pydantic[email]"
# Configure environment
cp .env.example .env
# Edit .env — add ANTHROPIC_API_KEY for LLM attacks + AutoResearch
# Create database
psql -c "CREATE USER arena WITH PASSWORD 'arena';"
psql -c "CREATE DATABASE model_arena OWNER arena;"
# Run migrations
alembic upgrade head
# Seed built-in attacks
python scripts/seed.py
# Start API server
uvicorn app.main:app --reload --port 8000
# Start Celery worker (new terminal)
celery -A app.workers.celery_app worker --loglevel=info
# Start the frontend (new terminal, from the repository root)
cd frontend
npm install --legacy-peer-deps
npm run dev
Open http://localhost:3000.
| Variable | Required | Description |
|---|---|---|
| `DATABASE_URL` | Yes | PostgreSQL async URL |
| `REDIS_URL` | Yes | Redis URL |
| `SECRET_KEY` | Yes | JWT signing key |
| `ANTHROPIC_API_KEY` | Optional | Enables PAIR, TAP, AutoResearch |
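The required/optional split above can be enforced at startup. The following is a minimal sketch using only the standard library; the `Settings` class and `load_settings` function are hypothetical names, and the backend may well load its configuration differently (for example through Pydantic).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variables marked "Yes" in the table above.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the variable table."""

    database_url: str
    redis_url: str
    secret_key: str
    anthropic_api_key: Optional[str] = None

    @property
    def llm_attacks_enabled(self) -> bool:
        # PAIR, TAP, and AutoResearch need the Anthropic key.
        return bool(self.anthropic_api_key)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, failing fast if a required variable is absent or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
    )
```

Accepting any mapping rather than reading `os.environ` directly keeps the function easy to test with a plain dictionary.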
Model Arena is open source and welcomes contributions of all kinds.
See CONTRIBUTING.md for the full guide. The short version:
- Fork this repo
- Copy `community_attacks/_template.py` → `community_attacks/your_attack.py`
- Implement the `BaseAttack` interface
- Open a Pull Request; CI validates it automatically
- Bug reports → GitHub Issues
- Feature requests → GitHub Discussions
- Frontend / backend improvements → PR against `main`
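To give a feel for the "implement the `BaseAttack` interface" step, here is a self-contained sketch. The `BaseAttack` class below is a hypothetical stand-in: the real interface is defined in the repository (see `community_attacks/_template.py` and CONTRIBUTING.md) and its method names and signatures may differ. The example attack is FGSM applied to a toy logistic-regression model, written in pure Python so it runs without dependencies.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence


class BaseAttack(ABC):
    """Hypothetical stand-in for the platform's interface, for illustration only."""

    name: str = "unnamed"

    @abstractmethod
    def run(self, x: Sequence[float], label: int) -> List[float]:
        """Return an adversarial version of the input x."""


class LinearFGSM(BaseAttack):
    """FGSM against a logistic-regression model p = sigmoid(w . x + b)."""

    name = "linear_fgsm"

    def __init__(self, weights: Sequence[float], bias: float, epsilon: float = 0.1):
        self.weights = list(weights)
        self.bias = bias
        self.epsilon = epsilon

    def run(self, x: Sequence[float], label: int) -> List[float]:
        z = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        p = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the cross-entropy loss with respect to x_i is (p - label) * w_i.
        grad = [(p - label) * w for w in self.weights]
        # FGSM: take one step of size epsilon along the sign of the gradient.
        return [xi + self.epsilon * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


attack = LinearFGSM(weights=[1.0, -2.0], bias=0.0, epsilon=0.1)
adversarial = attack.run([0.5, 0.5], label=1)
```

With these toy values the perturbation lowers the first feature and raises the second, which pushes the model's logit further from the true label. A real submission would wrap the model under evaluation rather than hard-coded weights.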