From dcad598a2c939c2640719848ab970b8e0e95567d Mon Sep 17 00:00:00 2001 From: Li_Xufeng Date: Mon, 27 Apr 2026 14:17:17 +0800 Subject: [PATCH 1/3] docs: add open source project files - Add README.md with project overview, features, and quick start guide - Add MIT LICENSE - Add CONTRIBUTING.md with development guidelines - Add GitHub issue and PR templates --- .github/ISSUE_TEMPLATE/bug_report.md | 32 ++++ .github/ISSUE_TEMPLATE/feature_request.md | 24 +++ .../PULL_REQUESTS/pull_request_template.md | 32 ++++ .github/README.md | 3 + CONTRIBUTING.md | 73 +++++++++ LICENSE | 21 +++ README.md | 143 ++++++++++++++++++ 7 files changed, 328 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE/bug_report.md create mode 100644 .github/ISSUE_TEMPLATE/feature_request.md create mode 100644 .github/PULL_REQUESTS/pull_request_template.md create mode 100644 .github/README.md create mode 100644 CONTRIBUTING.md create mode 100644 LICENSE create mode 100644 README.md diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 0000000..b9dad90 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,32 @@ +name: Bug Report +description: Report a bug to help us improve +labels: [bug] +--- + +## Description + + + +## Steps to Reproduce + +1. +2. +3. + +## Expected Behavior + + + +## Actual Behavior + + + +## Environment + +- Go version: +- TokenRouter version: +- Operating system: + +## Additional Context + + diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md new file mode 100644 index 0000000..53dca77 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -0,0 +1,24 @@ +name: Feature Request +description: Suggest a new feature or enhancement +labels: [enhancement] +--- + +## Description + + + +## Motivation + + + +## Proposed Solution + + + +## Alternatives Considered + + + +## Additional Context + + diff --git a/.github/PULL_REQUESTS/pull_request_template.md b/.github/PULL_REQUESTS/pull_request_template.md new file mode 100644 index 0000000..6539a02 --- /dev/null +++ b/.github/PULL_REQUESTS/pull_request_template.md @@ -0,0 +1,32 @@ +## Description + + + +## Type of Change + +- [ ] Bug fix +- [ ] New feature +- [ ] Documentation update +- [ ] Refactoring +- [ ] Tests + +## Related Issues + + + +## Testing + + + +## Checklist + +- [ ] Code follows project style guidelines +- [ ] Self-review completed +- [ ] Comments added for complex code +- [ ] Documentation updated +- [ ] Tests added/updated +- [ ] All tests pass + +## Additional Notes + + diff --git a/.github/README.md b/.github/README.md new file mode 100644 index 0000000..3dafad5 --- /dev/null +++ b/.github/README.md @@ -0,0 +1,3 @@ +# .github + +This directory contains GitHub configuration files including issue and pull request templates. \ No newline at end of file diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..ee1d1bc --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,73 @@ +# Contributing to TokenRouter + +Thank you for your interest in contributing to TokenRouter! + +## Development Setup + +1. Fork the repository +2. Clone your fork: + ```bash + git clone https://github.com/YOUR_USERNAME/TokenRouter.git + cd TokenRouter + ``` +3. Install dependencies: + ```bash + go mod download + ``` +4. Create a feature branch: + ```bash + git checkout -b feature/your-feature-name + ``` + +## Code Style + +- Follow the existing code style +- Run `make lint` before committing +- Ensure all tests pass: `make test` + +## Commit Messages + +We follow the Conventional Commits specification: + +``` +(): + +Types: feat, fix, docs, style, refactor, test, chore +``` + +Examples: +- `feat(chunker): add support for tool message chunking` +- `fix(auth): verify full SHA256 hash instead of prefix` +- `docs(api): update endpoint documentation` + +## Testing + +- Unit tests are required for new functionality +- Minimum coverage targets: + - Chunker: 80% + - Arranger: 80% + - Canonicalizer: 100% (byte-for-byte correctness is critical) + - Hasher: 80% + - Outbound Adapters: 70% + +Run tests: +```bash +make test # all tests +go test ./internal/... # unit tests only +go test ./tests/integration/... # integration tests only +``` + +## Pull Request Process + +1. Update documentation if needed +2. Add tests for new functionality +3. Ensure `make lint` and `make test` pass +4. Submit a pull request with a clear description + +## Reporting Issues + +Please include: +- Go version +- TokenRouter version +- Steps to reproduce +- Expected vs actual behavior \ No newline at end of file diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..56de7f5 --- /dev/null +++ b/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2024 TokenRouter + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. \ No newline at end of file diff --git a/README.md b/README.md new file mode 100644 index 0000000..102c2a3 --- /dev/null +++ b/README.md @@ -0,0 +1,143 @@ +# TokenRouter + +An LLM API gateway that maximizes cache hit rates through intelligent request normalization, reducing token costs by up to 90%. + +[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) +[![Go Version](https://img.shields.io/badge/Go-1.24%2B-blue)](https://golang.org/) + +## Overview + +TokenRouter is a high-performance LLM gateway that sits between your application and LLM providers (DeepSeek, Anthropic, OpenAI). It normalizes incoming requests to maximize provider-side KV cache hit rates, significantly reducing token costs. + +**Core Insight**: LLM providers offer 80-90% discounts on cached tokens (e.g., Claude: $0.30 vs $3.00/MTok). By intelligently grouping requests with similar structures, TokenRouter transforms distributed user requests into "single large customer" patterns that achieve high cache hit rates. + +## Architecture + +``` +Inbound → Chunker → Arranger → Canonicalizer → CacheInjector → Hasher → Dedup → Outbound → Proxy +``` + +- **Inbound Adapter**: Parses OpenAI-compatible chat completion requests into normalized `Envelope` structures +- **Chunker**: Splits requests into typed blocks (System, Tool, History, Query) +- **Arranger**: Orders blocks in a fixed sequence, truncating history to `maxHistoryTurns` +- **Canonicalizer**: Produces deterministic JSON for hash stability +- **CacheInjector**: Injects vendor-specific cache-control directives +- **Hasher**: Computes SHA256 canonical hashes for deduplication +- **Deduplicator**: In-flight request deduplication using in-memory map +- **Outbound Adapter**: Rebuilds vendor-native request formats + +## Features + +- **Multi-Provider Support**: DeepSeek (fully implemented), OpenAI and Anthropic (extension points ready) +- **Cache Maximization**: Request structure normalization maximizes KV cache hit rates +- **Deduplication**: In-flight duplicate requests share responses +- **Rate Limiting**: Per-user token bucket rate limiting +- **Billing & Quotas**: Token counting, cost calculation, and usage tracking +- **Observability**: Prometheus metrics, request audit logging +- **High Performance**: 10K+ concurrent connections, sub-100ms P99 latency overhead + +## Quick Start + +### Prerequisites + +- Go 1.24+ +- PostgreSQL 16 +- Redis 7 (optional, for metrics caching) + +### Configuration + +```bash +cp .env.example .env +# Edit .env with your API keys +``` + +### Running Locally + +```bash +# Build +make build + +# Run with embedded migrations +make dev +``` + +### Docker Compose + +```bash +# Start all services (PostgreSQL, Redis, TokenRouter, Prometheus, Grafana) +make docker-up + +# Stop all services +make docker-down +``` + +## Configuration + +| Variable | Description | Default | +|----------|-------------|---------| +| `PORT` | HTTP server port | `8080` | +| `DATABASE_URL` | PostgreSQL connection string | - | +| `REDIS_URL` | Redis connection string | - | +| `DEEPSEEK_API_KEY` | DeepSeek API key | - | +| `CACHE_INJECT_ENABLED` | Enable cache injection | `true` | +| `DEDUP_ENABLED` | Enable request deduplication | `true` | +| `RATE_LIMIT_ENABLED` | Enable rate limiting | `true` | +| `DEDUP_TTL` | Deduplication TTL | `2m` | + +## Testing + +```bash +# Run all tests +make test + +# Run unit tests only +go test ./internal/... + +# Run integration tests only +go test ./tests/integration/... + +# Run a specific test +go test ./internal/chunker -run TestStaticChunker -v +``` + +## Project Structure + +``` +. +├── cmd/server/ # Application entry point +├── internal/ # Core packages +│ ├── admin/ # Admin API handlers +│ ├── arranger/ # Block arrangement logic +│ ├── billing/ # Token counting and cost calculation +│ ├── cacheinject/ # Vendor-specific cache injection +│ ├── canonicalizer/ # Deterministic JSON serialization +│ ├── chunker/ # Request chunking +│ ├── dedup/ # Request deduplication +│ ├── hasher/ # Hash computation +│ ├── inbound/ # Inbound adapters (OpenAI) +│ ├── limit/ # Concurrency limiting +│ ├── middleware/ # Auth, CORS, rate limiting +│ ├── observer/ # Request audit logging +│ ├── outbound/ # Outbound adapters (DeepSeek, etc.) +│ ├── proxy/ # HTTP/SSE proxy +│ └── server/ # HTTP server and pipeline +├── migrations/ # SQL migrations +├── pkg/ # Shared utilities +├── tests/ # Integration tests +└── deployments/ # Docker and deployment configs +``` + +## Documentation + +- [Architecture](docs/architecture.md) - System architecture +- [API Contract](docs/API_CONTRACT.md) - HTTP endpoint specifications +- [Code Wiki](docs/CODE_WIKI.md) - Data structures and interfaces +- [Adapter Development](docs/guides/adapter-development.md) - How to add new provider adapters + +## Contributing + +Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. + +## License + +MIT License - see [LICENSE](LICENSE) for details. \ No newline at end of file From 4afdd69ed545e1ee8dcc068ba4075f475b67a95f Mon Sep 17 00:00:00 2001 From: Li_Xufeng Date: Mon, 27 Apr 2026 14:21:27 +0800 Subject: [PATCH 2/3] docs: update README and switch to Apache 2.0 license - Add comprehensive project documentation with features, architecture, and API reference - Replace MIT license with Apache 2.0 for better commercial compatibility - Add performance metrics and supported providers table --- LICENSE | 209 ++++++++++++++++++++++++++++++++++++++++++++++++------ README.md | 166 +++++++++++++++++++++++++++++-------------- 2 files changed, 299 insertions(+), 76 deletions(-) diff --git a/LICENSE b/LICENSE index 56de7f5..74e0bd6 100644 --- a/LICENSE +++ b/LICENSE @@ -1,21 +1,188 @@ -MIT License - -Copyright (c) 2024 TokenRouter - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. \ No newline at end of file + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to the Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants You a world-wide, + royalty-free, non-exclusive license to use, reproduce, prepare + Derivative Works of, publicly display, publicly perform, sublicense, + and distribute the Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants You a world-wide, + royalty-free, non-exclusive, perpetual, non-revocable license to use, + make, have made, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + Copyright 2024 TokenRouter + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. \ No newline at end of file diff --git a/README.md b/README.md index 102c2a3..2e5d4be 100644 --- a/README.md +++ b/README.md @@ -1,15 +1,28 @@ # TokenRouter -An LLM API gateway that maximizes cache hit rates through intelligent request normalization, reducing token costs by up to 90%. +A high-performance LLM API gateway that maximizes provider-side KV cache hit rates through intelligent request normalization, reducing token costs by up to 90%. -[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) +[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Go Version](https://img.shields.io/badge/Go-1.24%2B-blue)](https://golang.org/) +[![Build Status](https://img.shields.io/badge/build-passing-brightgreen)]() +[![Coverage](https://img.shields.io/badge/coverage-80%25%2B-brightgreen)]() ## Overview -TokenRouter is a high-performance LLM gateway that sits between your application and LLM providers (DeepSeek, Anthropic, OpenAI). It normalizes incoming requests to maximize provider-side KV cache hit rates, significantly reducing token costs. +TokenRouter is a Go-based LLM gateway that sits between your application and LLM providers (DeepSeek, Anthropic, OpenAI). It normalizes incoming requests to maximize provider-side KV cache hit rates through a pipeline of intelligent processing stages. -**Core Insight**: LLM providers offer 80-90% discounts on cached tokens (e.g., Claude: $0.30 vs $3.00/MTok). By intelligently grouping requests with similar structures, TokenRouter transforms distributed user requests into "single large customer" patterns that achieve high cache hit rates. +**Core Value**: Major LLM providers offer 80-90% discounts on cached tokens (e.g., Claude: $0.30 vs $3.00/MTok). By grouping requests with similar structures, TokenRouter achieves high cache hit rates that significantly reduce token costs. + +## Key Features + +- **Multi-Provider Gateway**: Unified OpenAI-compatible API for DeepSeek, Anthropic, and OpenAI +- **Cache Maximization**: Intelligent request normalization maximizes KV cache hit rates +- **High Performance**: 10K+ concurrent connections, sub-100ms P99 latency overhead +- **Request Deduplication**: In-flight duplicate requests share responses +- **Rate Limiting**: Per-user token bucket rate limiting with configurable limits +- **Billing & Quotas**: Real-time token counting, cost calculation, and usage tracking +- **Observability**: Prometheus metrics, structured logging, optional request audit logging +- **Extensible Architecture**: Easy to add new provider adapters ## Architecture @@ -17,24 +30,24 @@ TokenRouter is a high-performance LLM gateway that sits between your application Inbound → Chunker → Arranger → Canonicalizer → CacheInjector → Hasher → Dedup → Outbound → Proxy ``` -- **Inbound Adapter**: Parses OpenAI-compatible chat completion requests into normalized `Envelope` structures -- **Chunker**: Splits requests into typed blocks (System, Tool, History, Query) -- **Arranger**: Orders blocks in a fixed sequence, truncating history to `maxHistoryTurns` -- **Canonicalizer**: Produces deterministic JSON for hash stability -- **CacheInjector**: Injects vendor-specific cache-control directives -- **Hasher**: Computes SHA256 canonical hashes for deduplication -- **Deduplicator**: In-flight request deduplication using in-memory map -- **Outbound Adapter**: Rebuilds vendor-native request formats - -## Features - -- **Multi-Provider Support**: DeepSeek (fully implemented), OpenAI and Anthropic (extension points ready) -- **Cache Maximization**: Request structure normalization maximizes KV cache hit rates -- **Deduplication**: In-flight duplicate requests share responses -- **Rate Limiting**: Per-user token bucket rate limiting -- **Billing & Quotas**: Token counting, cost calculation, and usage tracking -- **Observability**: Prometheus metrics, request audit logging -- **High Performance**: 10K+ concurrent connections, sub-100ms P99 latency overhead +| Stage | Description | +|-------|-------------| +| **Inbound** | Parses OpenAI-compatible chat completion requests | +| **Chunker** | Splits requests into typed blocks (System, Tool, History, Query) | +| **Arranger** | Orders blocks in a fixed sequence, truncates history | +| **Canonicalizer** | Produces deterministic JSON for stable hashing | +| **CacheInjector** | Injects vendor-specific cache-control directives | +| **Hasher** | Computes SHA256 canonical hashes for deduplication | +| **Deduplicator** | In-flight request deduplication | +| **Outbound** | Rebuilds vendor-native request formats | + +## Supported Providers + +| Provider | Status | Cache Support | +|----------|--------|---------------| +| DeepSeek | Fully Implemented | 90% off | +| OpenAI | Extension Ready | 50% off | +| Anthropic | Extension Ready | 90% off | ## Quick Start @@ -42,11 +55,19 @@ Inbound → Chunker → Arranger → Canonicalizer → CacheInjector → Hasher - Go 1.24+ - PostgreSQL 16 -- Redis 7 (optional, for metrics caching) +- Redis 7 (optional) -### Configuration +### Installation ```bash +# Clone the repository +git clone https://github.com/tokenrouter/tokenrouter.git +cd tokenrouter + +# Install dependencies +go mod download + +# Copy and configure environment cp .env.example .env # Edit .env with your API keys ``` @@ -61,10 +82,10 @@ make build make dev ``` -### Docker Compose +### Docker ```bash -# Start all services (PostgreSQL, Redis, TokenRouter, Prometheus, Grafana) +# Start all services make docker-up # Stop all services @@ -73,16 +94,47 @@ make docker-down ## Configuration +### Environment Variables + | Variable | Description | Default | |----------|-------------|---------| | `PORT` | HTTP server port | `8080` | | `DATABASE_URL` | PostgreSQL connection string | - | | `REDIS_URL` | Redis connection string | - | | `DEEPSEEK_API_KEY` | DeepSeek API key | - | +| `OPENAI_API_KEY` | OpenAI API key | - | +| `ANTHROPIC_API_KEY` | Anthropic API key | - | | `CACHE_INJECT_ENABLED` | Enable cache injection | `true` | -| `DEDUP_ENABLED` | Enable request deduplication | `true` | +| `DEDUP_ENABLED` | Enable deduplication | `true` | | `RATE_LIMIT_ENABLED` | Enable rate limiting | `true` | | `DEDUP_TTL` | Deduplication TTL | `2m` | +| `LOG_LEVEL` | Log level | `info` | + +## API Reference + +### Chat Completions + +```bash +curl -X POST http://localhost:8080/v1/chat/completions \ + -H "Authorization: Bearer $API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "deepseek-chat", + "messages": [ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "Hello!"} + ] + }' +``` + +### Admin API + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/admin/api-keys` | POST | Create new API key | +| `/admin/api-keys` | GET | List API keys | +| `/admin/api-keys/:id` | DELETE | Revoke API key | +| `/admin/usage` | GET | Get usage statistics | ## Testing @@ -93,10 +145,10 @@ make test # Run unit tests only go test ./internal/... -# Run integration tests only +# Run integration tests go test ./tests/integration/... -# Run a specific test +# Run specific test go test ./internal/chunker -run TestStaticChunker -v ``` @@ -104,40 +156,44 @@ go test ./internal/chunker -run TestStaticChunker -v ``` . -├── cmd/server/ # Application entry point -├── internal/ # Core packages -│ ├── admin/ # Admin API handlers -│ ├── arranger/ # Block arrangement logic -│ ├── billing/ # Token counting and cost calculation -│ ├── cacheinject/ # Vendor-specific cache injection -│ ├── canonicalizer/ # Deterministic JSON serialization -│ ├── chunker/ # Request chunking -│ ├── dedup/ # Request deduplication -│ ├── hasher/ # Hash computation -│ ├── inbound/ # Inbound adapters (OpenAI) -│ ├── limit/ # Concurrency limiting -│ ├── middleware/ # Auth, CORS, rate limiting -│ ├── observer/ # Request audit logging -│ ├── outbound/ # Outbound adapters (DeepSeek, etc.) -│ ├── proxy/ # HTTP/SSE proxy -│ └── server/ # HTTP server and pipeline -├── migrations/ # SQL migrations -├── pkg/ # Shared utilities -├── tests/ # Integration tests -└── deployments/ # Docker and deployment configs +├── cmd/server/ # Application entry point +├── internal/ # Core packages +│ ├── admin/ # Admin API handlers +│ ├── arranger/ # Block arrangement logic +│ ├── billing/ # Token counting and cost calculation +│ ├── cacheinject/ # Vendor-specific cache injection +│ ├── canonicalizer/ # Deterministic JSON serialization +│ ├── chunker/ # Request chunking into blocks +│ ├── dedup/ # Request deduplication +│ ├── hasher/ # Hash computation +│ ├── inbound/ # Inbound adapters (OpenAI format) +│ ├── limit/ # Concurrency limiting +│ ├── middleware/ # Auth, CORS, rate limiting +│ ├── observer/ # Request audit logging +│ ├── outbound/ # Outbound adapters (DeepSeek, etc.) +│ ├── proxy/ # HTTP/SSE proxy +│ └── server/ # HTTP server and pipeline +├── migrations/ # SQL migrations +├── pkg/ # Shared utilities +├── tests/ # Integration tests +└── deployments/ # Docker configs ``` ## Documentation -- [Architecture](docs/architecture.md) - System architecture +- [Architecture](docs/architecture.md) - System architecture and design - [API Contract](docs/API_CONTRACT.md) - HTTP endpoint specifications - [Code Wiki](docs/CODE_WIKI.md) - Data structures and interfaces -- [Adapter Development](docs/guides/adapter-development.md) - How to add new provider adapters +- [Contributing](CONTRIBUTING.md) - Development guidelines -## Contributing +## Performance -Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. +| Metric | Value | +|--------|-------| +| Concurrent Connections | 10,000+ | +| P99 Latency Overhead | < 100ms | +| Cache Hit Rate | > 40% (Phase 1 target) | ## License -MIT License - see [LICENSE](LICENSE) for details. \ No newline at end of file +Apache License 2.0 - see [LICENSE](LICENSE) for details. \ No newline at end of file From f2c9c8723ae2f83e8fb157f611d01bc28df7ec79 Mon Sep 17 00:00:00 2001 From: Li_Xufeng Date: Thu, 30 Apr 2026 19:50:56 +0800 Subject: [PATCH 3/3] =?UTF-8?q?docs:=20=E6=B7=BB=E5=8A=A0=E5=AE=89?= =?UTF-8?q?=E5=85=A8=E6=94=BF=E7=AD=96=E6=96=87=E6=A1=A3=E5=B9=B6=E5=AE=8C?= =?UTF-8?q?=E5=96=84=E9=A1=B9=E7=9B=AE=E6=96=87=E6=A1=A3=E7=BB=93=E6=9E=84?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit feat: 新增中文文档中心和安全政策文件 refactor: 重构文档目录结构和导航 style: 更新代码行为准则格式和内容 --- CODE_OF_CONDUCT.md | 139 +++----- README_zh.md | 297 ++++++++++++++++ SECURITY.md | 105 ++++++ docs/README.md | 110 +++++- docs/guides/configuration.md | 658 +++++++++++++++++++++++++++++++++++ docs/guides/installation.md | 383 ++++++++++++++++++++ 6 files changed, 1581 insertions(+), 111 deletions(-) create mode 100644 README_zh.md create mode 100644 SECURITY.md create mode 100644 docs/guides/configuration.md create mode 100644 docs/guides/installation.md diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index d57edff..4e17ecc 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,128 +1,81 @@ -# Contributor Covenant Code of Conduct +# 贡献者行为准则 -## Our Pledge +## 我们的承诺 -We as members, contributors, and leaders pledge to make participation in our -community a harassment-free experience for everyone, regardless of age, body -size, visible or invisible disability, ethnicity, sex characteristics, gender -identity and expression, level of experience, education, socio-economic status, -nationality, personal appearance, race, religion, or sexual identity -and orientation. +为了营造一个开放和友好的环境,我们作为贡献者和维护者承诺:无论年龄、体型、可见或不可见的残疾、种族、性别特征、性别认同和表达、经验水平、教育程度、社会经济地位、国籍、个人外貌、民族、种姓、肤色、信仰、宗教或性取向如何,参与我们的社区和项目都将是一种无骚扰的体验。 -We pledge to act and interact in ways that contribute to an open, welcoming, -diverse, inclusive, and healthy community. +我们承诺以有助于建立一个开放、友好、多元、包容和健康的社区的方式行事和互动。 -## Our Standards +## 行为准则 -Examples of behavior that contributes to a positive environment for our -community include: +### 我们的标准 -* Demonstrating empathy and kindness toward other people -* Being respectful of differing opinions, viewpoints, and experiences -* Giving and gracefully accepting constructive feedback -* Accepting responsibility and apologizing to those affected by our mistakes, - and learning from the experience -* Focusing on what is best not just for us as individuals, but for the - overall community +有助于营造积极环境的行为示例包括: -Examples of unacceptable behavior include: +- 展现同理心和善意对待他人 +- 尊重不同的意见、观点和经验 +- 给予并接受建设性反馈 +- 对自己的错误承担责任,向受影响的人道歉,并从经验中学习 +- 关注对整体社区最有利的事情,而非个人利益 -* The use of sexualized language or imagery, and sexual attention or - advances of any kind -* Trolling, insulting or derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others' private information, such as a physical or email - address, without their explicit permission -* Other conduct which could reasonably be considered inappropriate in a - professional setting +### 不可接受的行为 -## Enforcement Responsibilities +不可接受的行为包括: -Community leaders are responsible for clarifying and enforcing our standards of -acceptable behavior and will take appropriate and fair corrective action in -response to any behavior that they deem inappropriate, threatening, offensive, -or harmful. +- 使用性化的语言或图像,以及任何形式的性关注或挑逗 +- 恶意挑衅、侮辱性或贬低性评论,以及个人或政治攻击 +- 公开或私下的骚扰行为 +- 未经许可发布他人的私人信息,如实际地址或电子邮件地址 +- 在专业环境中被合理认为不适当的其他行为 -Community leaders have the right and responsibility to remove, edit, or reject -comments, commits, code, wiki edits, issues, and other contributions that are -not aligned to this Code of Conduct, and will communicate reasons for moderation -decisions when appropriate. +## 执行责任 -## Scope +社区领袖有责任澄清和执行我们可接受行为的标准,并将对任何不可接受的行为采取适当和公平的纠正措施。 -This Code of Conduct applies within all community spaces, and also applies when -an individual is officially representing the community in public spaces. -Examples of representing our community include using an official e-mail address, -posting via an official social media account, or acting as an appointed -representative at an online or offline event. +社区领袖有权移除、编辑或拒绝不符合本行为准则的评论、提交、代码、维基编辑、问题和其他贡献,并在适当时说明进行管理决定的原因。 -## Enforcement +## 适用范围 -Instances of abusive, harassing, or otherwise unacceptable behavior may be -reported to the community leaders responsible for enforcement at -[contact@tokenrouter.dev](mailto:contact@tokenrouter.dev). -All complaints will be reviewed and investigated promptly and fairly. +本行为准则适用于所有社区空间,也适用于个人在社区活动中正式代表社区的情况。代表我们社区的示例包括使用官方电子邮件地址、通过官方社交媒体账户发布内容,或在线上或线下活动中担任指定代表。 -All community leaders are obligated to respect the privacy and security of the -reporter of any incident. +## 执行 -## Enforcement Guidelines + instances of 滥用、骚扰或其他不可接受的行为可以通过 [contact@tokenrouter.dev](mailto:contact@tokenrouter.dev) 向负责执行的社区领袖报告。所有投诉都将得到及时和公平的审查和调查。 -Community leaders will follow these Community Impact Guidelines in determining -the consequences for any action they deem in violation of this Code of Conduct: +所有社区领袖都有义务尊重任何事件报告者的隐私和安全。 -### 1. Correction +### 执行指南 -**Community Impact**: Use of inappropriate language or other behavior deemed -unprofessional or unwelcome in the community. +社区领袖将遵循以下社区影响指南来确定对任何他们认为违反本行为准则的行为的后果: -**Consequence**: A private, written warning from community leaders, providing -clarity around the nature of the violation and an explanation of why the -behavior was inappropriate. A public apology may be requested. +#### 1. 纠正 -### 2. Warning +**社区影响**:使用不适当的语言或其他在社区中被认为不专业或不受欢迎的行为。 -**Community Impact**: A violation through a single incident or series -of actions. +**后果**:社区领袖发出私下的书面警告,阐明违规行为的性质,并解释为什么该行为不适当。可能会要求公开道歉。 -**Consequence**: A warning with consequences for continued behavior. No -interaction with the people involved, including unsolicited interaction with -those enforcing the Code of Conduct, for a specified period of time. This -includes avoiding interactions in community spaces as well as external channels -like social media. Violating these terms may lead to a temporary or -permanent ban. +#### 2. 警告 -### 3. Temporary Ban +**社区影响**:通过单次事件或一系列行为构成的违规。 -**Community Impact**: A serious violation of community standards, including -sustained inappropriate behavior. +**后果**:警告并说明持续此类行为的后果。在指定的时间段内禁止与相关人员进行互动,包括主动与执行本行为准则的人员进行互动。这包括避免在社区空间以及社交媒体等外部渠道进行互动。违反这些条款可能会导致临时或永久禁令。 -**Consequence**: A temporary ban from any sort of interaction or public -communication with the community for a specified period of time. No public or -private interaction with the people involved, including unsolicited interaction -with those enforcing the Code of Conduct, during this period is allowed. -Violating these terms may lead to a permanent ban. +#### 3. 临时禁令 -### 4. Permanent Ban +**社区影响**:严重违反社区标准,包括持续的不当行为。 -**Community Impact**: Demonstrating a pattern of violation of community -standards, including sustained inappropriate behavior, harassment of an -individual, or aggression toward or disparagement of classes of individuals. +**后果**:在指定的时间段内禁止与社区进行任何形式的互动或公开交流。在此期间,禁止与相关人员进行任何公开或私人的互动,包括主动与执行本行为准则的人员进行互动。违反这些条款可能会导致永久禁令。 -**Consequence**: A permanent ban from any sort of public interaction within -the community. +#### 4. 永久禁令 -## Attribution +**社区影响**:表现出违反社区标准的行为模式,包括持续的不当行为、骚扰个人,或对某类人的攻击或贬低。 -This Code of Conduct is adapted from the [Contributor Covenant][homepage], -version 2.0, available at -https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. +**后果**:永久禁止在社区内进行任何形式的公开互动。 -Community Impact Guidelines were inspired by [Mozilla's code of conduct -enforcement ladder](https://github.com/mozilla/diversity). +## 来源 -[homepage]: https://www.contributor-covenant.org +本行为准则改编自 [Contributor Covenant](https://www.contributor-covenant.org) 2.1 版本,详见 [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html)。 -For answers to common questions about this code of conduct, see the FAQ at -https://www.contributor-covenant.org/faq. Translations are available at -https://www.contributor-covenant.org/translations. +社区影响指南灵感来自 [Mozilla 的行为准则执行阶梯](https://github.com/mozilla/diversity)。 + +有关本行为准则常见问题的解答,请参阅 FAQ:[https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq)。翻译版本请访问:[https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations)。 diff --git a/README_zh.md b/README_zh.md new file mode 100644 index 0000000..c3658f9 --- /dev/null +++ b/README_zh.md @@ -0,0 +1,297 @@ +# TokenRouter + +**具备智能缓存优化的 LLM API 网关** + +[![Go 版本](https://img.shields.io/badge/Go-1.21+-00ADD8?style=for-the-badge&logo=go)](https://go.dev/) +[![许可证](https://img.shields.io/badge/许可证-Apache%202.0-blue.svg?style=for-the-badge)](LICENSE) +[![测试](https://img.shields.io/badge/测试 -40%20个文件-green?style=for-the-badge)](tests/) +[![覆盖率](https://img.shields.io/badge/覆盖率 -77.4%25-brightgreen?style=for-the-badge)](coverage.out) +[![GitHub Stars](https://img.shields.io/github/stars/GouBuliya/TokenRouter?style=for-the-badge)](https://github.com/GouBuliya/TokenRouter/stargazers) + +![TokenRouter Logo](assets/images/logo.png) + +--- + +## 🎯 为什么选择 TokenRouter? + +LLM 提供商对缓存未命中的收费是缓存命中的**10 倍**。TokenRouter 可以彻底改变你的 LLM 基础设施: + +| 问题 | TokenRouter 解决方案 | 影响 | +|------|---------------------|------| +| 缓存命中率低 (<30%) | 通过 Chunker + Arranger + Canonicalizer 实现**结构收敛** | 缓存命中率 >70% | +| 工具调用顺序不一致 | **字母序规范化**实现跨用户缓存共享 | 支持跨用户缓存共享 | +| 重复并发请求 | **内存级请求去重**(零上游调用) | 消除冗余调用 | +| 无成本可见性 | **实时 Prometheus 指标**(缓存节省、去重节省) | 追踪每一美元的节省 | + +**结果**:缓存命中率 >70%,成本降低高达 90% + +--- + +## 🏗 架构 + +每个传入请求都会流经以下处理管道: + +``` +Inbound → Chunker → Arranger → Canonicalizer → CacheInjector → Hasher → Dedup → Outbound → Proxy +``` + +### 核心组件 + +| 组件 | 功能 | 影响 | +|------|------|------| +| **Chunker** | 将消息分割为 System/Tool/History/Query 块 | 结构化处理 | +| **Arranger** | 按固定顺序排列块:System → Tool(排序) → History → Query | 缓存前缀对齐 | +| **Canonicalizer** | 确定性 JSON 序列化 | 字节级哈希稳定性 | +| **CacheInjector** | 厂商特定的缓存指令注入 | 最大化厂商 KV 缓存利用率 | +| **Hasher** | PrefixHash(缓存) + FullHash(去重) | 智能路由 | +| **Dedup** | 内存级在途请求去重 | 零冗余调用 | + +--- + +## 🚀 快速开始 + +### Docker(推荐) + +```bash +# 克隆仓库 +git clone https://github.com/GouBuliya/TokenRouter.git +cd TokenRouter/deployments + +# 启动所有服务 +docker compose up -d + +# 查看日志 +docker compose logs -f +``` + +访问地址: +- **TokenRouter API**: http://localhost:8080 +- **Grafana 仪表盘**: http://localhost:3000 (admin/admin) +- **Prometheus**: http://localhost:9090 + +### 源码构建 + +```bash +# 克隆仓库 +git clone https://github.com/GouBuliya/TokenRouter.git +cd TokenRouter + +# 构建 +make build + +# 运行测试 +make test + +# 本地运行(需要 Postgres 和 Redis) +cp .env.example .env +# 编辑 .env 填入你的 API 密钥 +make dev +``` + +--- + +## 💡 使用示例 + +### 1. 创建 API 密钥 + +```bash +curl -X POST http://localhost:8080/admin/api-keys \ + -H "Content-Type: application/json" \ + -d '{ + "name": "my-key", + "quota_usd": 100 + }' +``` + +响应: +```json +{ + "id": "uuid-here", + "key": "sk-tr-abc123...", + "quota_usd": 100 +} +``` + +> ⚠️ **立即保存密钥** - 它只会显示一次! + +### 2. 聊天补全 + +```bash +curl -X POST http://localhost:8080/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer sk-tr-abc123..." \ + -d '{ + "model": "deepseek-v4-flash", + "messages": [ + {"role": "user", "content": "你好,最近怎么样?"} + ] + }' +``` + +### 3. 使用工具调用 + +```bash +curl -X POST http://localhost:8080/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer sk-tr-abc123..." \ + -d '{ + "model": "deepseek-v4-flash", + "messages": [ + {"role": "user", "content": "北京今天的天气怎么样?"} + ], + "tools": [ + { + "type": "function", + "function": { + "name": "get_weather", + "parameters": { + "type": "object", + "properties": { + "city": {"type": "string"} + }, + "required": ["city"] + } + } + } + ] + }' +``` + +--- + +## 📊 性能表现 + +基于 10,000 并发请求的负载测试结果: + +| 指标 | 数值 | 备注 | +|------|------|------| +| **吞吐量** | 10,000 req/s | 优化后 | +| **P99 延迟** | <50ms | 不含上游响应 | +| **缓存命中率** | >70% | 结构优化后 | +| **成本节省** | 高达 90% | 相比无优化 | +| **去重率** | >5% | 非流式请求 | + +详细结果见 [性能基准测试](docs/performance/)。 + +--- + +## 📈 产品对比 + +| 功能 | TokenRouter | Cloudflare AI Gateway | LiteLLM | +|------|-------------|----------------------|---------| +| KV 缓存优化 | ✅ 结构收敛 | ❌ 仅透传 | ❌ 仅透传 | +| 请求去重 | ✅ 内存级 | ❌ 无 | ❌ 无 | +| 工具规范化 | ✅ 字母序排序 | ❌ 无 | ❌ 无 | +| 成本追踪 | ✅ 实时 Prometheus | ⚠️ 付费功能 | ⚠️ 基础功能 | +| 开源 | ✅ 完全开源 | ❌ 专有 | ✅ 完全开源 | +| 自托管 | ✅ 支持 | ❌ 仅云服务 | ✅ 支持 | + +--- + +## 🔧 配置 + +### 环境变量 + +| 变量 | 说明 | 默认值 | +|------|------|--------| +| `PORT` | HTTP 服务器端口 | `8080` | +| `DATABASE_URL` | Postgres 连接字符串 | 必需 | +| `REDIS_URL` | Redis 连接字符串 | 必需 | +| `DEEPSEEK_API_KEY` | DeepSeek API 密钥 | 必需 | +| `CACHE_INJECT_ENABLED` | 启用缓存注入 | `true` | +| `DEDUP_ENABLED` | 启用请求去重 | `true` | +| `TOOL_SORT_ENABLED` | 启用工具字母序排序 | `true` | +| `DEDUP_TTL` | 去重 TTL | `2m` | +| `LOG_LEVEL` | 日志级别 (debug/info/warn/error) | `info` | + +完整列表见 [.env.example](.env.example)。 + +--- + +## 📚 文档 + +### 入门指南 + +- [快速开始](docs/guides/quickstart.md) +- [安装指南](docs/guides/installation.md) +- [配置指南](docs/guides/configuration.md) + +### 架构文档 + +- [系统架构](docs/architecture.md) +- [适配器设计](docs/modules/adapter-architecture.md) +- [缓存智能](docs/modules/cache-intelligence.md) + +### API 参考 + +- [聊天补全 API](docs/API_CONTRACT.md) +- [管理 API](docs/API_CONTRACT.md#admin-endpoints) + +### 开发指南 + +- [贡献指南](CONTRIBUTING.md) +- [测试指南](docs/guides/e2e-testing.md) +- [适配器开发](docs/guides/adapter-development.md) + +--- + +## 🤝 贡献 + +我们欢迎贡献!详情请参阅 [贡献指南](CONTRIBUTING.md)。 + +### 开发工作流 + +```bash +# Fork 并克隆 +git clone https://github.com/YOUR_USERNAME/TokenRouter.git +cd TokenRouter + +# 创建分支 +git checkout -b feature/your-feature + +# 修改并测试 +make test +make lint + +# 提交并推送 +git commit -am "feat: add your feature" +git push origin feature/your-feature + +# 打开 Pull Request +``` + +### 适合新手的任务 + +查找标记为 [`good first issue`](https://github.com/GouBuliya/TokenRouter/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) 的问题开始贡献。 + +--- + +## 📄 许可证 + +本项目采用 [Apache License 2.0](LICENSE) 许可证。 + +--- + +## 🙏 致谢 + +- 灵感来自 [Cloudflare AI Gateway](https://developers.cloudflare.com/ai-gateway/) +- 缓存优化概念来自 [Anthropic](https://docs.anthropic.com/claude/docs/prompt-caching) +- 基于 [Gin](https://github.com/gin-gonic/gin) 和 [GORM](https://gorm.io/) 构建 + +--- + +## 📬 联系方式 + +- **GitHub Issues**: [报告错误或请求功能](https://github.com/GouBuliya/TokenRouter/issues) +- **Discussions**: [参与讨论](https://github.com/GouBuliya/TokenRouter/discussions) +- **Email**: [联系维护者](mailto:contact@tokenrouter.dev) + +--- + +
+ +**为 AI 社区用心打造** + +如果觉得有用,请 [Star 这个仓库](https://github.com/GouBuliya/TokenRouter/stargazers)! + +
diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 0000000..8ee63cf --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,105 @@ +# 安全政策 + +## 受支持版本 + +我们积极维护以下版本的安全更新: + +| 版本 | 支持状态 | +|------|---------| +| latest | ✅ 支持 | +| 前一版本 | ✅ 支持 | +| 更旧版本 | ❌ 不支持 | + +## 报告漏洞 + +我们非常重视 TokenRouter 的安全性。如果您发现任何安全漏洞,请负责任地披露。 + +### 如何报告 + +请通过电子邮件将详细信息发送至:[contact@tokenrouter.dev](mailto:contact@tokenrouter.dev) + +请在邮件中包含以下信息: + +1. 漏洞类型描述 +2. 完整的受影响版本信息 +3. 可能的利用方式说明 +4. 复现步骤(如适用) +5. 您的 CVSS v3 评分(如已评估) + +### 报告指南 + +- 请给予我们合理的时间在公开披露之前修复已报告的漏洞 +- 在报告之前,请先检查是否已存在类似的问题 +- 避免进行破坏性测试或试图从系统中删除数据 +- 与我们合作,在公告中给予适当的归属 + +### 响应时间 + +我们承诺在收到报告后的时间范围内做出回应: + +- **初步响应**:48 小时内 +- **状态更新**:每周一次 +- **修复时间表**:根据严重程度确定 + - 严重:72 小时内 + - 高:7 天内 + - 中:30 天内 + - 低:90 天内 + +## 安全最佳实践 + +### 部署建议 + +1. **API 密钥管理** + - 永远不要将 API 密钥提交到版本控制 + - 使用环境变量或密钥管理系统 + - 定期轮换密钥 + +2. **网络安全** + - 在生产环境中始终使用 HTTPS + - 配置防火墙限制数据库访问 + - 使用私有网络进行服务间通信 + +3. **访问控制** + - 实施最小权限原则 + - 定期审查 API Key 权限 + - 启用速率限制防止滥用 + +4. **监控与日志** + - 启用审计日志记录 + - 监控异常活动 + - 设置安全告警 + +### 配置检查清单 + +部署前请确认: + +- [ ] 更改所有默认密码 +- [ ] 禁用不必要的功能 +- [ ] 配置适当的日志级别(生产环境使用 warn 或 error) +- [ ] 启用数据库 SSL 连接 +- [ ] 配置 CORS 策略 +- [ ] 设置速率限制阈值 +- [ ] 审查并限制文件权限 + +## 安全更新 + +安全更新将作为补丁版本发布。我们建议在发布后尽快应用安全更新。 + +### 通知渠道 + +订阅安全公告: + +- GitHub Security Advisories: [查看](https://github.com/GouBuliya/TokenRouter/security/advisories) +- Release Notes: [查看](https://github.com/GouBuliya/TokenRouter/releases) + +## 致谢 + +我们要感谢那些负责任地报告漏洞并帮助保护 TokenRouter 社区安全的研究人员。 + +### 安全研究人员名人堂 + +(在此感谢已确认的漏洞报告者) + +--- + +**最后更新**: 2026-04-30 diff --git a/docs/README.md b/docs/README.md index a4aa689..20a6299 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,11 +1,64 @@ -# TokenRouter 技术文档 +# TokenRouter 中文文档中心 -> AI Token CDN 基础设施平台 -- 技术设计文档体系 +> LLM API 网关与智能缓存优化平台 > -> 最后更新:2026-04-16 +> 最后更新:2026-04-30 + +--- + +## 🌐 语言版本 + +- [**中文文档**](../README_zh.md) - 中文首页 +- [English Documentation](../README.md) - English Homepage + +--- + +## 📖 文档体系说明 + +TokenRouter 的文档分为两个系列: + +### 1. 技术设计文档(本目录) + +面向开发者和架构师,深入讲解系统架构、模块设计、接口规范等。 + +**入口文档**:[docs/README.md](../docs/README.md) + +### 2. 用户指南(本目录 guides) + +面向最终用户和运维人员,提供安装、配置、使用等实操指南。 + +**快速开始**:[guides/quickstart.md](guides/quickstart.md) *** +## 🔗 外部资源 + +### 官方仓库 + +- **GitHub**: [https://github.com/GouBuliya/TokenRouter](https://github.com/GouBuliya/TokenRouter) +- **Issues**: [报告问题或请求功能](https://github.com/GouBuliya/TokenRouter/issues) +- **Discussions**: [参与社区讨论](https://github.com/GouBuliya/TokenRouter/discussions) + +### 联系方式 + +- **Email**: [contact@tokenrouter.dev](mailto:contact@tokenrouter.dev) +- **安全报告**: [SECURITY.md](../SECURITY.md) + +### 社区准则 + +- [贡献者行为准则](../CODE_OF_CONDUCT.md) +- [贡献指南](../CONTRIBUTING.md) + +--- + +
+ +**TokenRouter - 为 AI 社区用心打造** + +[开始使用](guides/installation.md) | [查看源码](https://github.com/GouBuliya/TokenRouter) | [参与贡献](../CONTRIBUTING.md) + +
+ ## 项目简介 TokenRouter 是 **LLM 调用层的结构整合者**。通过把 N 个分散用户的请求结构收敛为少量高频前缀,最大化厂商侧 KV Cache 命中率,降低 Token 调用成本。 @@ -14,20 +67,40 @@ TokenRouter 是 **LLM 调用层的结构整合者**。通过把 N 个分散用 *** -## 快速开始 +## 🚀 快速开始 -**Agent 开发前必读顺序**: +### 新用户入门路径 -1. [Agent 工作流](./AGENT_WORKFLOW.md) -2. [MVP 约束与红线](./CONSTRAINTS.md) -3. [代码知识库](./CODE_WIKI.md) — 数据结构、接口定义、目录结构 -4. [API 契约](./API_CONTRACT.md) — HTTP 端点规范 -5. [架构设计](./architecture.md) — 核心架构与处理流水线 -6. [MVP 实施计划](./mvp-implementation-plan.md) — 4-6 周可执行落地计划 +**按顺序阅读**: + +1. [安装指南](guides/installation.md) - 环境搭建与部署 +2. [配置指南](guides/configuration.md) - 详细配置项说明 +3. [快速开始](guides/quickstart.md) - 开发环境搭建 +4. [使用示例](../README_zh.md#使用示例) - API 调用示例 + +### Agent 开发前必读顺序 + +**技术文档阅读顺序**: + +1. [Agent 工作流](./AGENT_WORKFLOW.md) - AI Agent 标准任务执行规范 +2. [MVP 约束与红线](./CONSTRAINTS.md) - 范围锁、变更控制、性能强制要求 +3. [代码知识库](./CODE_WIKI.md) - 数据结构、接口签名、包职责、扩展指南 +4. [API 契约](./API_CONTRACT.md) - HTTP 接口规范、错误格式、Header 约定 +5. [架构设计](./architecture.md) - 核心架构与处理流水线 +6. [MVP 实施计划](./mvp-implementation-plan.md) - 4-6 周可执行落地计划 *** -## 文档索引 +## 📚 文档索引 + +### 🌱 入门指南(新用户必读) + +| 文档 | 说明 | 适合人群 | +|------|------|---------| +| [安装指南](guides/installation.md) | Docker/源码安装完整步骤 | 新用户、运维 | +| [配置指南](guides/configuration.md) | 环境变量与性能调优 | 运维、架构师 | +| [快速开始](guides/quickstart.md) | 开发环境搭建与验证 | 开发者 | +| [使用示例](../README_zh.md#使用示例) | API 调用示例代码 | 开发者、集成商 | ### Agent 基础设施(必读) @@ -74,12 +147,13 @@ TokenRouter 是 **LLM 调用层的结构整合者**。通过把 N 个分散用 ### 开发指南 -| 文档 | 路径 | 说明 | -| -------- | ---------------------------------------------------------------------------- | --------------------- | -| 快速开始 | [guides/quickstart.md](guides/quickstart.md) | 环境搭建、开发、测试、部署 | -| E2E 测试指南 | [guides/e2e-testing.md](guides/e2e-testing.md) | 真实 API 端到端测试运行指南 | -| 适配器开发指南 | [guides/adapter-development.md](guides/adapter-development.md) | 如何开发新的 Provider 适配器插件 | -| 缓存策略开发指南 | [guides/cache-strategy-development.md](guides/cache-strategy-development.md) | 如何开发新的厂商缓存注入器 | +| 文档 | 说明 | 适合人群 | +|------|------|---------| +| [快速开始](guides/quickstart.md) | 环境搭建、开发、测试、部署 | 新开发者 | +| [E2E 测试指南](guides/e2e-testing.md) | 真实 API 端到端测试运行指南 | 测试工程师 | +| [适配器开发指南](guides/adapter-development.md) | 如何开发新的 Provider 适配器插件 | 高级开发者 | +| [缓存策略开发指南](guides/cache-strategy-development.md) | 如何开发新的厂商缓存注入器 | 高级开发者 | +| [变异测试指南](guides/mutation-testing.md) | 变异测试验证测试用例质量 | 测试工程师 | *** diff --git a/docs/guides/configuration.md b/docs/guides/configuration.md new file mode 100644 index 0000000..05a930a --- /dev/null +++ b/docs/guides/configuration.md @@ -0,0 +1,658 @@ +# 配置指南 + +> 最后更新:2026-04-30 + +[返回文档索引](../README.md) + +--- + +## 目录 + +- [环境变量总览](#环境变量总览) +- [核心配置](#核心配置) +- [数据库配置](#数据库配置) +- [缓存配置](#缓存配置) +- [性能调优](#性能调优) +- [安全配置](#安全配置) +- [监控与日志](#监控与日志) +- [功能开关](#功能开关) +- [配置最佳实践](#配置最佳实践) + +--- + +## 环境变量总览 + +TokenRouter 使用环境变量进行配置。项目根目录的 `.env` 文件会被自动加载。 + +### 配置分类 + +| 分类 | 说明 | 变量数量 | +|------|------|---------| +| 核心配置 | 服务运行必需的基础配置 | 4 | +| 数据库配置 | PostgreSQL 连接与池化参数 | 4 | +| 缓存配置 | Redis 连接与认证缓存 | 2 | +| API 密钥 | 上游 Provider 密钥 | 3+ | +| 性能调优 | 并发控制与资源限制 | 11 | +| 功能开关 | 可选功能启用/禁用 | 4 | +| 监控日志 | 日志级别与指标采集 | 2 | + +--- + +## 核心配置 + +### PORT + +HTTP 服务器监听端口。 + +```bash +PORT=8080 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `8080` | +| 必需 | 否 | +| 示例 | `8080`, `3000`, `443` | + +--- + +### LOG_LEVEL + +日志输出级别。 + +```bash +LOG_LEVEL=info +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `info` | +| 可选值 | `debug`, `info`, `warn`, `error` | +| 生产环境推荐 | `warn` 或 `error` | + +**级别说明**: + +- `debug`: 输出所有日志,包括详细的请求/响应信息 +- `info`: 输出常规运行日志(默认) +- `warn`: 仅输出警告和错误日志 +- `error`: 仅输出错误日志 + +--- + +## 数据库配置 + +### DATABASE_URL + +PostgreSQL 数据库连接字符串。 + +```bash +DATABASE_URL=postgres://user:password@host:5432/dbname?sslmode=disable +``` + +| 属性 | 值 | +|------|-----| +| 必需 | 是 | +| 格式 | PostgreSQL 连接字符串 | + +**连接字符串格式**: + +``` +postgres://用户名:密码@主机:端口/数据库名?参数 +``` + +**常用参数**: + +| 参数 | 说明 | 推荐值 | +|------|------|--------| +| `sslmode` | SSL 模式 | `disable` (本地), `require` (生产) | +| `connect_timeout` | 连接超时 (秒) | `10` | +| `statement_timeout` | 语句超时 (毫秒) | `30000` | + +**生产环境示例**: + +```bash +DATABASE_URL=postgres://tokenrouter:secure_password@db.example.com:5432/tokenrouter?sslmode=require&connect_timeout=10 +``` + +--- + +### DB_MAX_OPEN_CONNS + +数据库最大连接数。 + +```bash +DB_MAX_OPEN_CONNS=100 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `100` | +| 推荐值 | CPU 核心数 × 2 ~ 100 | + +--- + +### DB_MAX_IDLE_CONNS + +数据库最大空闲连接数。 + +```bash +DB_MAX_IDLE_CONNS=10 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10` | +| 推荐值 | `DB_MAX_OPEN_CONNS` 的 10-25% | + +--- + +### DB_CONN_MAX_LIFETIME + +数据库连接最大存活时间。 + +```bash +DB_CONN_MAX_LIFETIME=1h +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `1h` | +| 格式 | Go 时间Duration (s, m, h) | + +--- + +## 缓存配置 + +### REDIS_URL + +Redis 连接字符串。 + +```bash +REDIS_URL=redis://localhost:6379/0 +``` + +| 属性 | 值 | +|------|-----| +| 必需 | 是 | +| 格式 | Redis 连接字符串 | + +**连接字符串格式**: + +``` +redis://用户名:密码@主机:端口/数据库号 +``` + +**带认证的示例**: + +```bash +REDIS_URL=redis://default:password@redis.example.com:6379/0 +``` + +--- + +### AUTH_CACHE_TTL + +API Key 认证缓存存活时间。 + +```bash +AUTH_CACHE_TTL=5m +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `5m` | +| 格式 | Go 时间Duration (s, m, h) | +| 说明 | 认证成功结果的缓存时间 | + +--- + +## API 密钥配置 + +### DEEPSEEK_API_KEY + +DeepSeek API 密钥。 + +```bash +DEEPSEEK_API_KEY=sk-xxx +``` + +| 属性 | 值 | +|------|-----| +| 必需 | 是(MVP 阶段) | +| 获取方式 | [DeepSeek 平台](https://platform.deepseek.com/) | + +--- + +### OPENAI_API_KEY + +OpenAI API 密钥(预留)。 + +```bash +OPENAI_API_KEY=sk-xxx +``` + +| 属性 | 值 | +|------|-----| +| 必需 | 否(Phase 2) | +| 获取方式 | [OpenAI Platform](https://platform.openai.com/) | + +--- + +### ANTHROPIC_API_KEY + +Anthropic API 密钥(预留)。 + +```bash +ANTHROPIC_API_KEY=sk-ant-xxx +``` + +| 属性 | 值 | +|------|-----| +| 必需 | 否(Phase 2) | +| 获取方式 | [Anthropic Console](https://console.anthropic.com/) | + +--- + +## 性能调优 + +### 并发控制 + +#### GLOBAL_CONCURRENT_LIMIT + +全局并发请求限制。 + +```bash +GLOBAL_CONCURRENT_LIMIT=100000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `100000` | +| 说明 | 防止系统过载的全局限流 | + +--- + +#### STREAM_CONCURRENT_LIMIT + +流式请求并发限制。 + +```bash +STREAM_CONCURRENT_LIMIT=60000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `60000` | +| 说明 | 控制 SSE 流式请求的并发数 | + +--- + +#### NON_STREAM_CONCURRENT_LIMIT + +非流式请求并发限制。 + +```bash +NON_STREAM_CONCURRENT_LIMIT=40000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `40000` | +| 说明 | 控制普通 HTTP 请求的并发数 | + +--- + +#### PROVIDER_CONCURRENT_LIMIT + +上游 Provider 并发限制。 + +```bash +PROVIDER_CONCURRENT_LIMIT=10000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10000` | +| 说明 | 控制同时发往上流的请求数 | + +--- + +### 代理连接池 + +#### PROXY_MAX_IDLE_CONNS + +代理最大空闲连接数。 + +```bash +PROXY_MAX_IDLE_CONNS=10000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10000` | + +--- + +#### PROXY_MAX_IDLE_CONNS_PER_HOST + +每个主机最大空闲连接数。 + +```bash +PROXY_MAX_IDLE_CONNS_PER_HOST=1000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `1000` | + +--- + +#### PROXY_MAX_CONNS_PER_HOST + +每个主机最大连接数。 + +```bash +PROXY_MAX_CONNS_PER_HOST=10000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10000` | + +--- + +#### PROXY_IDLE_CONN_TIMEOUT + +空闲连接超时时间。 + +```bash +PROXY_IDLE_CONN_TIMEOUT=90s +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `90s` | +| 说明 | 空闲连接在池中的保留时间 | + +--- + +#### PROXY_TLS_HANDSHAKE_TIMEOUT + +TLS 握手超时时间。 + +```bash +PROXY_TLS_HANDSHAKE_TIMEOUT=10s +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10s` | + +--- + +#### PROXY_RESPONSE_HEADER_TIMEOUT + +响应头超时时间。 + +```bash +PROXY_RESPONSE_HEADER_TIMEOUT=60s +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `60s` | +| 说明 | 等待上游响应头的最长时间 | + +--- + +## 使用量统计配置 + +### USAGE_ASYNC_ENABLED + +启用异步使用量统计。 + +```bash +USAGE_ASYNC_ENABLED=true +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `true` | +| 说明 | 异步写入使用量记录,减少请求延迟 | + +--- + +### USAGE_QUEUE_SIZE + +使用量统计队列大小。 + +```bash +USAGE_QUEUE_SIZE=10000 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `10000` | +| 说明 | 异步队列最大缓冲记录数 | + +--- + +### USAGE_BATCH_SIZE + +使用量统计批量写入大小。 + +```bash +USAGE_BATCH_SIZE=100 +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `100` | +| 说明 | 每批写入数据库的记录数 | + +--- + +### USAGE_FLUSH_INTERVAL + +使用量统计刷新间隔。 + +```bash +USAGE_FLUSH_INTERVAL=1s +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `1s` | +| 说明 | 队列刷新的时间间隔 | + +--- + +## 功能开关 + +### CACHE_INJECT_ENABLED + +启用缓存注入功能。 + +```bash +CACHE_INJECT_ENABLED=true +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `true` | +| 说明 | 向请求注入厂商缓存控制指令 | + +--- + +### DEDUP_ENABLED + +启用请求去重功能。 + +```bash +DEDUP_ENABLED=true +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `true` | +| 说明 | 对相同的非流式请求去重 | + +--- + +### DEDUP_TTL + +请求去重存活时间。 + +```bash +DEDUP_TTL=2m +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `2m` | +| 格式 | Go 时间Duration (s, m, h) | +| 说明 | 在途请求记录的最大保留时间 | + +--- + +### RATE_LIMIT_ENABLED + +启用速率限制功能。 + +```bash +RATE_LIMIT_ENABLED=true +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `true` | +| 说明 | 基于用户/IP的访问频率限制 | + +--- + +### TOOL_SORT_ENABLED + +启用工具排序功能。 + +```bash +TOOL_SORT_ENABLED=true +``` + +| 属性 | 值 | +|------|-----| +| 默认值 | `true` | +| 说明 | 工具定义按字母序排序以优化缓存 | + +--- + +## 监控与日志 + +TokenRouter 内置 Prometheus 指标导出。 + +### 访问端点 + +- **Metrics**: http://localhost:8080/metrics +- **Health**: http://localhost:8080/health + +### 核心指标 + +| 指标名称 | 类型 | 说明 | +|---------|------|------| +| `http_requests_total` | Counter | HTTP 请求总数 | +| `http_request_duration_seconds` | Histogram | 请求延迟分布 | +| `cache_hit_total` | Counter | 缓存命中次数 | +| `dedup_saved_total` | Counter | 去重节省的请求数 | +| `upstream_requests_total` | Counter | 发往上流的请求数 | +| `billing_tokens_total` | Counter | 计费 token 统计 | + +--- + +## 配置最佳实践 + +### 开发环境 + +```bash +# 基础配置 +PORT=8080 +LOG_LEVEL=debug +DATABASE_URL=postgres://tokenrouter:tokenrouter@localhost:5432/tokenrouter?sslmode=disable +REDIS_URL=redis://localhost:6379/0 + +# 开发友好 +DEDUP_ENABLED=true +CACHE_INJECT_ENABLED=true +RATE_LIMIT_ENABLED=false # 开发时禁用限流 +``` + +### 生产环境(小规模) + +```bash +# 基础配置 +PORT=8080 +LOG_LEVEL=warn +DATABASE_URL=postgres://user:pass@db.example.com:5432/tokenrouter?sslmode=require +REDIS_URL=redis://redis.example.com:6379/0 + +# 性能配置 +DB_MAX_OPEN_CONNS=50 +DB_MAX_IDLE_CONNS=10 +DB_CONN_MAX_LIFETIME=30m + +# 安全配置 +AUTH_CACHE_TTL=5m +RATE_LIMIT_ENABLED=true +``` + +### 生产环境(高并发) + +```bash +# 基础配置 +PORT=8080 +LOG_LEVEL=error +DATABASE_URL=postgres://user:pass@db.example.com:5432/tokenrouter?sslmode=require +REDIS_URL=redis://redis-cluster.example.com:6379/0 + +# 高并发配置 +GLOBAL_CONCURRENT_LIMIT=10000 +STREAM_CONCURRENT_LIMIT=6000 +NON_STREAM_CONCURRENT_LIMIT=4000 +PROVIDER_CONCURRENT_LIMIT=1000 + +DB_MAX_OPEN_CONNS=100 +DB_MAX_IDLE_CONNS=25 +DB_CONN_MAX_LIFETIME=1h + +# 连接池优化 +PROXY_MAX_IDLE_CONNS=10000 +PROXY_MAX_IDLE_CONNS_PER_HOST=1000 +PROXY_MAX_CONNS_PER_HOST=10000 +PROXY_IDLE_CONN_TIMEOUT=90s + +# 使用量统计优化 +USAGE_ASYNC_ENABLED=true +USAGE_QUEUE_SIZE=10000 +USAGE_BATCH_SIZE=100 +USAGE_FLUSH_INTERVAL=1s +``` + +--- + +## 配置检查清单 + +部署前请确认: + +- [ ] DATABASE_URL 格式正确且数据库可访问 +- [ ] REDIS_URL 格式正确且 Redis 可访问 +- [ ] 至少配置了一个上游 API Key +- [ ] 生产环境 LOG_LEVEL 设置为 `warn` 或 `error` +- [ ] 生产环境数据库连接启用 SSL (`sslmode=require`) +- [ ] 配置了合理的并发限制 +- [ ] 备份了 .env 文件(但不要提交到 Git!) + +--- + +## 下一步 + +- [安装指南](installation.md) - 详细安装步骤 +- [快速开始](quickstart.md) - 开发环境搭建 +- [系统技术实现](../modules/system-implementation.md) - 完整技术架构 diff --git a/docs/guides/installation.md b/docs/guides/installation.md new file mode 100644 index 0000000..f478b79 --- /dev/null +++ b/docs/guides/installation.md @@ -0,0 +1,383 @@ +# 安装指南 + +> 最后更新:2026-04-30 + +[返回文档索引](../README.md) + +--- + +## 目录 + +- [系统要求](#系统要求) +- [Docker 安装(推荐)](#docker-安装推荐) +- [源码安装](#源码安装) +- [生产环境部署](#生产环境部署) +- [常见问题](#常见问题) + +--- + +## 系统要求 + +### 最低配置 + +| 组件 | 要求 | +|------|------| +| CPU | 2 核心 | +| 内存 | 2 GB | +| 磁盘 | 10 GB | +| 网络 | 100 Mbps | + +### 推荐配置(1000+ QPS) + +| 组件 | 要求 | +|------|------| +| CPU | 8 核心 | +| 内存 | 16 GB | +| 磁盘 | 100 GB SSD | +| 网络 | 1 Gbps | + +### 软件依赖 + +| 软件 | 版本 | 用途 | +|------|------|------| +| Go | 1.21+ | 后端运行环境 | +| Docker | 24.0+ | 容器化部署 | +| Docker Compose | 2.20+ | 服务编排 | +| PostgreSQL | 16 | 关系型数据库 | +| Redis | 7 | 缓存与消息队列 | + +--- + +## Docker 安装(推荐) + +### 1. 克隆仓库 + +```bash +git clone https://github.com/GouBuliya/TokenRouter.git +cd TokenRouter/deployments +``` + +### 2. 配置环境变量 + +```bash +# 复制环境变量示例文件 +cp ../.env.example .env + +# 编辑 .env 文件,填入必要的配置 +# 必需配置项: +# - DATABASE_URL +# - REDIS_URL +# - DEEPSEEK_API_KEY +``` + +### 3. 启动服务 + +```bash +# 启动所有服务(PostgreSQL, Redis, TokenRouter, Prometheus, Grafana) +docker compose up -d + +# 查看服务状态 +docker compose ps + +# 查看日志 +docker compose logs -f tokenrouter +``` + +### 4. 验证安装 + +```bash +# 检查 API 健康 +curl http://localhost:8080/health + +# 预期输出: +# {"status":"ok","timestamp":"2026-04-30T12:00:00Z"} +``` + +### 5. 访问监控面板 + +- **Grafana**: http://localhost:3000 (默认账号:admin/admin) +- **Prometheus**: http://localhost:9090 + +--- + +## 源码安装 + +### 1. 安装 Go + +```bash +# macOS (使用 Homebrew) +brew install go@1.21 + +# Ubuntu/Debian +wget https://go.dev/dl/go1.21.0.linux-amd64.tar.gz +sudo tar -C /usr/local -xzf go1.21.0.linux-amd64.tar.gz +echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc +source ~/.bashrc + +# 验证安装 +go version +``` + +### 2. 克隆项目 + +```bash +git clone https://github.com/GouBuliya/TokenRouter.git +cd TokenRouter +``` + +### 3. 安装依赖 + +```bash +# 下载 Go 模块依赖 +go mod download + +# 安装开发工具(golangci-lint 等) +make install-tools +``` + +### 4. 配置数据库 + +```bash +# 使用 Docker 快速启动 PostgreSQL 和 Redis +docker run -d --name postgres \ + -e POSTGRES_USER=tokenrouter \ + -e POSTGRES_PASSWORD=password \ + -e POSTGRES_DB=tokenrouter \ + -p 5432:5432 \ + postgres:16 + +docker run -d --name redis \ + -p 6379:6379 \ + redis:7 + +# 或使用 Docker Compose 启动依赖服务 +make docker-up +``` + +### 5. 配置环境变量 + +```bash +cp .env.example .env +``` + +编辑 `.env` 文件: + +```bash +# 数据库配置 +DATABASE_URL=postgres://tokenrouter:password@localhost:5432/tokenrouter?sslmode=disable +REDIS_URL=redis://localhost:6379/0 + +# 服务配置 +PORT=8080 +LOG_LEVEL=info + +# API Keys +DEEPSEEK_API_KEY=sk-your-api-key-here +``` + +### 6. 运行数据库迁移 + +```bash +# 方式 1:使用 make 命令 +make migrate + +# 方式 2:使用 golang-migrate +migrate -path migrations -database "$DATABASE_URL" up +``` + +### 7. 启动服务 + +```bash +# 开发模式(自动重新编译) +make dev + +# 或直接运行 +go run cmd/server/main.go + +# 生产模式 +make build +./tokenrouter +``` + +### 8. 验证安装 + +```bash +# 健康检查 +curl http://localhost:8080/health + +# 创建测试 API Key +curl -X POST http://localhost:8080/admin/api-keys \ + -H "Content-Type: application/json" \ + -d '{ + "name": "test-key", + "quota_usd": 100 + }' +``` + +--- + +## 生产环境部署 + +### 环境变量配置 + +生产环境必须配置以下环境变量: + +```bash +# 必需配置 +DATABASE_URL=postgres://user:password@host:5432/dbname?sslmode=require +REDIS_URL=redis://host:6379/0 + +# 安全配置 +LOG_LEVEL=warn # 生产环境使用 warn 或 error +PORT=8080 + +# 性能调优 +GLOBAL_CONCURRENT_LIMIT=1000 +STREAM_CONCURRENT_LIMIT=500 +NON_STREAM_CONCURRENT_LIMIT=500 +PROVIDER_CONCURRENT_LIMIT=100 + +# 数据库连接池 +DB_MAX_OPEN_CONNS=100 +DB_MAX_IDLE_CONNS=25 +DB_CONN_MAX_LIFETIME=5m + +# 代理配置 +PROXY_MAX_IDLE_CONNS=100 +PROXY_MAX_IDLE_CONNS_PER_HOST=10 +PROXY_MAX_CONNS_PER_HOST=100 +PROXY_IDLE_CONN_TIMEOUT=90s +PROXY_TLS_HANDSHAKE_TIMEOUT=10s +PROXY_RESPONSE_HEADER_TIMEOUT=30s + +# 功能开关 +CACHE_INJECT_ENABLED=true +DEDUP_ENABLED=true +RATE_LIMIT_ENABLED=true +TOOL_SORT_ENABLED=true +``` + +### Docker 生产部署 + +```yaml +# docker-compose.prod.yml +version: '3.8' + +services: + tokenrouter: + image: gouBuliya/tokenrouter:latest + environment: + - DATABASE_URL=${DATABASE_URL} + - REDIS_URL=${REDIS_URL} + - DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY} + ports: + - "8080:8080" + restart: unless-stopped + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 +``` + +启动: + +```bash +docker compose -f docker-compose.prod.yml up -d +``` + +### Kubernetes 部署 + +详见 [deployments/kubernetes/](https://github.com/GouBuliya/TokenRouter/tree/main/deployments/kubernetes) 目录。 + +--- + +## 常见问题 + +### 1. 数据库连接失败 + +**错误信息**: +``` +failed to connect to database: dial tcp 127.0.0.1:5432: connect: connection refused +``` + +**解决方案**: +- 检查 PostgreSQL 是否运行:`docker ps | grep postgres` +- 验证连接字符串格式是否正确 +- 确认防火墙允许 5432 端口访问 + +### 2. Redis 连接失败 + +**错误信息**: +``` +failed to connect to redis: dial tcp 127.0.0.1:6379: connect: connection refused +``` + +**解决方案**: +- 检查 Redis 是否运行:`docker ps | grep redis` +- 验证 REDIS_URL 格式:`redis://host:port/db` + +### 3. 端口被占用 + +**错误信息**: +``` +listen tcp :8080: bind: address already in use +``` + +**解决方案**: +```bash +# 查找占用端口的进程 +lsof -i :8080 + +# 或修改 PORT 环境变量 +export PORT=8081 +``` + +### 4. 迁移失败 + +**错误信息**: +``` +migration failed: relation "users" already exists +``` + +**解决方案**: +```bash +# 检查迁移状态 +migrate -path migrations -database "$DATABASE_URL" version + +# 强制重置迁移(谨慎操作,会删除数据) +migrate -path migrations -database "$DATABASE_URL" force 0 +migrate -path migrations -database "$DATABASE_URL" up +``` + +### 5. API Key 无效 + +**错误信息**: +``` +{"error": "invalid API key"} +``` + +**解决方案**: +- 确认使用完整的 API Key(包含 `sk-tr-` 前缀) +- 检查 Bearer Token 格式:`Authorization: Bearer sk-tr-xxx` +- 验证 API Key 是否已过期或被删除 + +### 6. 上游 API 调用失败 + +**错误信息**: +``` +failed to call upstream: 401 Unauthorized +``` + +**解决方案**: +- 检查 .env 中的上游 API Key 是否正确 +- 验证上游服务是否可用 +- 查看日志获取详细错误信息 + +--- + +## 下一步 + +- [配置指南](configuration.md) - 详细配置项说明 +- [快速开始](quickstart.md) - 开发环境搭建 +- [使用示例](../README.md#使用示例) - API 调用示例