fix: make embedding batch size configurable#696
Conversation
WalkthroughThis PR adds a configurable ChangesEmbedding batch size configuration
🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/embedding_client.py (1)
185-185: ⚡ Quick winUpdate comment to reflect configurable batch size.
The comment references "max 2048 embeddings per request", but batch size is now configurable per this PR. Consider updating to something more generic like "Create batches that fit configured API limits" to avoid confusion.
📝 Suggested comment update
- # 2. Create batches that fit API limits (max 2048 embeddings per request, max 300,000 tokens per request) + # 2. Create batches that fit API limits (batch size and token limits)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/embedding_client.py` at line 185, Update the inline comment that currently says "max 2048 embeddings per request" to a generic note reflecting that batch size is configurable, e.g., "Create batches that fit configured API limits (e.g., max embeddings per request and max tokens per request)"; make this change next to the batching logic that uses the batch_size configuration variable (and related max token limit variable) so the comment matches the runtime-configurable behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/embedding_client.py`:
- Line 185: Update the inline comment that currently says "max 2048 embeddings
per request" to a generic note reflecting that batch size is configurable, e.g.,
"Create batches that fit configured API limits (e.g., max embeddings per request
and max tokens per request)"; make this change next to the batching logic that
uses the batch_size configuration variable (and related max token limit
variable) so the comment matches the runtime-configurable behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 612c6ca2-cb97-4ac2-8bdf-1ad34326c4e0
📒 Files selected for processing (6)
.env.templateconfig.toml.exampledocs/v3/contributing/configuration.mdxsrc/config.pysrc/embedding_client.pytests/llm/test_embedding_client.py
|
Amazing, thank you! |
|
Thanks! Glad this was useful. |
Summary
max_batch_sizeto embedding model config and runtime embedding configFixes #687.
Verification
uv run ruff check src\config.py src\embedding_client.py tests\llm\test_embedding_client.pyuv run ruff format --check src\config.py src\embedding_client.py tests\llm\test_embedding_client.pyuv run basedpyright src\config.py src\embedding_client.py tests\llm\test_embedding_client.pymax_batch_size=2splitssimple_batch_embed(["a", "b", "c"])into two OpenAI embedding calls and that embedding settings parsemax_batch_size=10uv run pytest tests\llm\test_embedding_client.py -qis blocked in this Windows environment before the test file runs because the repo-leveltests/conftest.pyimports app startup, which importssrc.telemetry.reasoning_traces, which imports Unix-onlyfcntl.Summary by CodeRabbit
New Features
max_batch_sizeconfiguration for embedding providers, enabling customization of per-request input limits. Defaults to provider-specific values (e.g., 2048 for OpenAI, 100 for Gemini).Documentation