More Than a Proxy — Your AI Backend
Unify 15+ LLM providers behind one API key. Built-in session memory, knowledge base (RAG), team management, cost tracking, and health monitoring. Self-hosted with one Docker command.
中文文档 · Quick Start · Features · Architecture · Docs · License
Other AI gateways just forward requests. WebRouter is a full-stack AI backend.
Managing multiple AI providers is painful — scattered keys, no cost visibility, no failover, no memory, no team controls. WebRouter gives you a single control plane for all your LLM traffic — plus the application-layer features most gateways don't bother with.
- Tired of hardcoding provider URLs? → One gateway endpoint, auto-routed to the best provider
- Need your AI to remember conversations? → Built-in session memory, not a stateless proxy
- Want your AI to know your docs? → Built-in RAG knowledge base, no separate vector DB needed
- Worried about provider outages? → Automatic health checks, cooldowns, and failover
- No idea how much you're spending? → Per-model cost tracking, quotas, and billing reports
- Sharing API keys across the team? → Token management with per-member quotas and model whitelists
Set model: "auto" and WebRouter picks the optimal model based on request complexity — simple queries get fast/cheap models, complex reasoning gets powerful ones.
Automatic health checks with latency tracking. Dead providers enter cooldown; traffic shifts to healthy alternatives — no manual intervention needed.
Real-time cost accounting per model, per token, per team. Billing reports, quota management, and budget alerts built in.
Built-in desensitization engine strips PII (phone numbers, ID cards, emails) from requests before they reach upstream providers.
Invite team members, assign quotas, restrict model access. Each member gets a unique API key with scoped permissions.
The wr-proxy Go gateway handles request forwarding, retry with backoff, streaming, and metering — all with minimal latency overhead.
| Type | Description | Health | Latency | Cost |
|---|---|---|---|---|
direct |
Official APIs (OpenAI, Anthropic, Google...) | ✅ | ✅ | — |
aggregate |
Aggregator platforms (OhMyGPT, API2D...) | ✅ | ✅ | Manual |
litellm |
LiteLLM proxy | ✅ | ✅ | — |
custom |
Any OpenAI-compatible gateway | ✅ | ✅ | — |
Clients can use @recall or X-Recall-Session header to automatically recover and inject conversation history from the server — no manual context management needed.
Built-in enterprise-grade retrieval-augmented generation. Auto-captures conversations, extracts structured knowledge via LLM, and injects relevant context into every request.
Token compression, session compression, and dynamic content reordering reduce upstream token consumption and improve prompt cache hit rates automatically.
One-click export of environment variables and config for Claude Code, Codex, Cursor, Continue, and more.
Don't want to install anything? Try the live demo at demo.webrouter.tech (login: admin / admin123456).
- Python 3.8+
- Go 1.21+ (only if building wr-proxy from source; pre-built binaries included)
- 2 GB+ RAM
Linux / macOS:
git clone https://github.com/candywon/webrouter.git
cd webrouter
bash deploy/install.shWindows (PowerShell):
git clone https://github.com/candywon/webrouter.git
cd webrouter
powershell -ExecutionPolicy Bypass -File deploy\install.ps1The install script auto-detects your OS and architecture, sets up a virtual environment, installs dependencies, and generates start/stop scripts.
open http://localhost:5050
# Default login: admin / admin123- Go to Providers → + Add
- Select type
direct, paste your OpenAI base URL and API key - Click 🔍 Check to verify connectivity
- Your gateway is ready at
http://localhost:5051/v1/chat/completions
cd webrouter
docker compose -f deploy/docker-compose.yml up -dFull documentation is available at webrouter.tech/docs/:
| Guide | Topics |
|---|---|
| Getting Started | Quick Start, Installation, Deployment |
| Core Concepts | Architecture, Providers, Tokens & Teams |
| Smart Routing | Auto model selection, fallback strategies |
| Memory & Knowledge | Session Recall, Knowledge Base & RAG |
| Operations | Monitoring, Alerting, Billing, Desensitization |
| API Reference | Full API documentation |
┌─────────────┐ HTTP ┌─────────────────┐
│ Browser/CLI │ ───────────→ │ WebRouter │
│ │ ←──────────── │ (Flask) │
└─────────────┘ │ :5050 │
└──────┬──────────┘
│
┌──────▼──────┐
│ wr-proxy │
│ (Go) :5051 │
└──────┬──────┘
│
┌───────────────┼───────────────┐
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ direct │ │ aggregate │ │ custom │
│ (Official) │ │ (Aggregator)│ │ (Gateway) │
└─────────────┘ └─────────────┘ └─────────────┘
| Component | Stack | Role |
|---|---|---|
| WebRouter (backend) | Python Flask | Admin panel, REST API, database models, scheduler |
| wr-proxy | Go 1.22 | High-performance API proxy: routing, retry, desensitization, metering |
Both components share a SQLite database (MySQL/PostgreSQL also supported) for configuration and request logs.
webrouter/
├── backend/ # Flask backend
│ ├── app.py # Application factory
│ ├── config.py # Configuration
│ ├── models/ # Database models
│ ├── routes/ # 12 API blueprints (/api/*)
│ ├── services/ # Business logic
│ ├── static/ # Frontend SPA
│ │ ├── index.html
│ │ ├── js/ # 21 page modules
│ │ ├── css/
│ │ └── i18n/ # en.json, zh-CN.json
│ └── start.py # Process manager
├── wr-proxy/ # Go proxy gateway
│ ├── main.go
│ ├── proxy.go # HTTP forwarding
│ ├── smart_model.go # Smart routing
│ ├── retry.go # Retry with backoff
│ ├── desensitize.go # PII stripping
│ ├── meter.go # Cost tracking
│ └── ...
├── deploy/ # Deployment configs
│ ├── install.sh # Linux/macOS installer
│ ├── install.ps1 # Windows installer (PowerShell)
│ ├── Dockerfile
│ ├── docker-compose.yml
│ └── nginx.conf
├── docs/ # Documentation
├── data/ # Runtime data
└── .env # Environment config (auto-generated)
All settings are managed via the .env file (auto-generated on first install):
| Variable | Description | Default |
|---|---|---|
SESSION_SECRET |
Flask session key | Auto-generated |
DATABASE_URI |
Database connection string | SQLite |
REDIS_URL |
Redis connection (optional, for caching) | — |
FLASK_ENV |
Runtime environment | production |
FLASK_HOST |
Listen address | 0.0.0.0 |
FLASK_PORT |
Flask port | 5050 |
WR_PROXY_PORT |
wr-proxy port | 5051 |
ENABLE_SCHEDULER |
Run health checks & alerts on schedule | 0 (off in debug) |
python3 backend/start.py start # Start all services
python3 backend/start.py stop # Stop all services
python3 backend/start.py restart # Restart
python3 backend/start.py status # Check status
python3 backend/start.py logs # Tail logsOr use the generated shell scripts:
# Linux / macOS
./start.sh # Start
./stop.sh # Stop
# Windows (PowerShell)
.\start.ps1 # Start
.\stop.ps1 # Stop- Plugin SDK — Extensible plugin interface for EE modules
- SSO / SAML / OIDC — Enterprise single sign-on
- Audit logging — Tamper-proof operation audit trail
- Cluster mode — Multi-instance with shared state
- Cloud hosted version — Zero-ops managed service
- Advanced routing DSL — Custom routing rules by department, project, or tag
See LICENSING.md for the Community vs Enterprise edition feature matrix.
We welcome contributions! Before submitting a PR, please:
- Sign the Contributor License Agreement (CLA) — this grants us the right to re-license the project in the future
- Follow the existing code style
- Test your changes locally
See CONTRIBUTING.md for full guidelines.
WebRouter is available in two editions. See the full comparison on our website.
| Feature | Community | Enterprise |
|---|---|---|
| Price | Free | Custom |
| Max concurrent | 50 | Customizable |
| SSO / SAML / OIDC | — | ✅ |
| Cluster mode | — | ✅ |
| Audit logging | Basic | Custom rules |
| Knowledge Base & RAG | Basic | Advanced |
| License | BSL 1.1 → Apache 2.0 (2029) | Proprietary EULA |
See LICENSE for the full text and LICENSING.md for the dual-edition strategy.
WebRouter is built with:
- Flask — Python web framework
- modernc.org/sqlite — Pure-Go SQLite (no CGO)
- APScheduler — Job scheduling
- Font Awesome — Icons
One gateway. All AI APIs.




