fix: make embedding batch size configurable by pntech20 · Pull Request #696 · plastic-labs/honcho

pntech20 · 2026-05-17T03:14:14Z

Summary

- Add an optional `max_batch_size` to the embedding model config and the runtime embedding config.
- Use the configured batch cap when splitting OpenAI/Gemini embedding requests.
- Document the env/TOML setting for OpenAI-compatible providers such as DashScope.

Fixes #687.
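
For readers who want a concrete picture of the new field, here is a minimal sketch of an optional, validated batch cap. It uses a plain dataclass as a stand-in; `EmbeddingConfigSketch` and its `model` field are hypothetical names for illustration, and the PR's actual config classes in `src/config.py` may validate differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingConfigSketch:
    """Hypothetical stand-in for an embedding config; not the PR's real class."""

    model: str
    # None means "no override": the client falls back to its provider default.
    max_batch_size: Optional[int] = None

    def __post_init__(self) -> None:
        value = self.max_batch_size
        if value is None:
            return
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError("max_batch_size must be a positive integer or None")
```

Leaving the field as `None` by default keeps existing configurations unchanged, which matches the "optional" wording above.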

Verification

- `uv run ruff check src\config.py src\embedding_client.py tests\llm\test_embedding_client.py`
- `uv run ruff format --check src\config.py src\embedding_client.py tests\llm\test_embedding_client.py`
- `uv run basedpyright src\config.py src\embedding_client.py tests\llm\test_embedding_client.py`
- Direct runtime check that `max_batch_size=2` splits `simple_batch_embed(["a", "b", "c"])` into two OpenAI embedding calls, and that embedding settings parse `max_batch_size=10`.
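
The splitting behaviour checked above can be illustrated with a small sketch. `split_into_batches` is a hypothetical helper, not the code in `src/embedding_client.py`, and it only applies the count limit; the real batching also has to respect a per-request token limit.

```python
from typing import Iterator, List, Sequence


def split_into_batches(items: Sequence[str], max_batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most max_batch_size items (hypothetical helper)."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start:start + max_batch_size])


# Three inputs with a cap of two produce two batches, i.e. two API calls.
batches = list(split_into_batches(["a", "b", "c"], max_batch_size=2))
# batches is [["a", "b"], ["c"]]
```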

`uv run pytest tests\llm\test_embedding_client.py -q` is blocked in this Windows environment before the test file runs, because the repo-level `tests/conftest.py` imports app startup, which imports `src.telemetry.reasoning_traces`, which imports the Unix-only `fcntl` module.
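
For context on the failure mode: `fcntl` is part of the standard library only on Unix, so an unconditional `import fcntl` raises `ImportError` on Windows at import time. A common pattern for tolerating this is a guarded import, sketched below. This is illustrative only; it is not part of this PR, and `HAS_FCNTL` is a hypothetical name.

```python
# Guarded import: fcntl exists on Unix but not on Windows.
try:
    import fcntl
except ImportError:
    fcntl = None  # callers must check before using file-locking features

# Hypothetical flag that other code could consult before calling into fcntl.
HAS_FCNTL = fcntl is not None
```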

Summary by CodeRabbit

New Features
- Added optional max_batch_size configuration for embedding providers, enabling customization of per-request input limits. Defaults to provider-specific values (e.g., 2048 for OpenAI, 100 for Gemini).
Documentation
- Updated embedding configuration guide with batch size setting examples and guidance for providers with smaller request limits.

coderabbitai · 2026-05-17T03:14:26Z

Walkthrough

This PR adds a configurable max_batch_size field to embedding model configuration, allowing per-provider batch size limits to be set. Gemini and OpenAI clients now respect this setting, with fallbacks to their respective defaults (100 and 2048). Configuration examples, documentation, and test coverage accompany the change.
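
The fallback described above can be sketched as follows. The default values (2048 for OpenAI, 100 for Gemini) come from this PR; the `PROVIDER_DEFAULT_BATCH_SIZE` mapping and the `effective_batch_size` function are hypothetical names, and the actual clients may structure this differently.

```python
from typing import Optional

# Provider defaults as stated in the walkthrough; the mapping itself is illustrative.
PROVIDER_DEFAULT_BATCH_SIZE = {"openai": 2048, "gemini": 100}


def effective_batch_size(provider: str, max_batch_size: Optional[int]) -> int:
    """Return the configured cap if set, otherwise the provider default."""
    if max_batch_size is not None:
        return max_batch_size
    return PROVIDER_DEFAULT_BATCH_SIZE[provider]
```

With this shape, an OpenAI-compatible provider with a smaller limit (for example a cap of 10) only needs the override set, while unconfigured deployments keep their current behaviour.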

Changes

Embedding batch size configuration

| Layer / File(s) | Summary |
| --- | --- |
| Configuration field definitions: `src/config.py` | `max_batch_size` field (validated as positive `int` or `None`) added to `ConfiguredEmbeddingModelSettings` and `EmbeddingModelConfig`. |
| Configuration examples and documentation: `.env.template`, `config.toml.example`, `docs/v3/contributing/configuration.mdx` | Environment template, TOML config example, and embedding configuration documentation updated to show `max_batch_size` as an optional setting, with guidance on provider defaults and when to override (e.g., DashScope limit of 10). |
| Configuration resolution: `src/config.py` | `resolve_embedding_model_config` propagates `configured.max_batch_size` into the resolved runtime `EmbeddingModelConfig`. |
| Client batching implementation: `src/embedding_client.py` | Gemini (default 100) and OpenAI (default 2048) batch size initialization now honors `config.max_batch_size` when set. Client recreation signature updated to include `max_batch_size` so configuration changes trigger reinitialization. |
| Test coverage: `tests/llm/test_embedding_client.py` | Test helper accepts configurable `max_batch_size`. New tests verify `simple_batch_embed` splits inputs per batch limit and `EmbeddingSettings` parses the environment variable. Environment cleanup extended. |
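
To illustrate why the recreation signature needs to include `max_batch_size`, here is a minimal sketch of signature-based client caching. All names (`RuntimeConfigSketch`, `ClientCacheSketch`) are hypothetical, and `object()` stands in for constructing a real embedding client; the actual logic in `src/embedding_client.py` may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RuntimeConfigSketch:
    """Hypothetical resolved config; not the PR's EmbeddingModelConfig."""

    provider: str
    model: str
    max_batch_size: Optional[int] = None


class ClientCacheSketch:
    """Rebuilds the client only when the config signature changes."""

    def __init__(self) -> None:
        self._signature: Optional[Tuple] = None
        self._client: Optional[object] = None
        self.builds = 0  # counts how many times a client was (re)built

    def get(self, config: RuntimeConfigSketch) -> object:
        # If max_batch_size were left out of this tuple, changing it would
        # silently keep the old client with the old batch cap.
        signature = (config.provider, config.model, config.max_batch_size)
        if signature != self._signature:
            self._client = object()  # placeholder for building a real client
            self._signature = signature
            self.builds += 1
        return self._client
```

Calling `get` twice with the same config reuses the client, while changing only `max_batch_size` triggers a rebuild.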

🎯 2 (Simple) | ⏱️ ~10 minutes

🐰 A config field hops into place,
With batch sizes no more of one pace,
DashScope now grins,
Each embed request wins,
Ten inputs fit perfectly in space! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix: make embedding batch size configurable' directly and clearly describes the main change: adding configuration support for embedding batch size.
Linked Issues check	✅ Passed	The PR implements the recommended solution from issue `#687` by exposing max_batch_size as a configurable field in embedding model config, allowing users to set provider-appropriate batch limits.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to adding max_batch_size configurability: config declarations, environment/TOML examples, documentation, runtime batching logic, and corresponding tests.


coderabbitai

🧹 Nitpick comments (1)

src/embedding_client.py (1)
185-185: ⚡ Quick win

Update comment to reflect configurable batch size.

The comment references "max 2048 embeddings per request", but batch size is now configurable per this PR. Consider updating to something more generic like "Create batches that fit configured API limits" to avoid confusion.
📝 Suggested comment update
-        # 2. Create batches that fit API limits (max 2048 embeddings per request, max 300,000 tokens per request)
+        # 2. Create batches that fit API limits (batch size and token limits)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/embedding_client.py` at line 185, Update the inline comment that
currently says "max 2048 embeddings per request" to a generic note reflecting
that batch size is configurable, e.g., "Create batches that fit configured API
limits (e.g., max embeddings per request and max tokens per request)"; make this
change next to the batching logic that uses the batch_size configuration
variable (and related max token limit variable) so the comment matches the
runtime-configurable behavior.


📥 Commits

Reviewing files that changed from the base of the PR and between 8fcbb54 and b3a485c.

📒 Files selected for processing (6)

.env.template
config.toml.example
docs/v3/contributing/configuration.mdx
src/config.py
src/embedding_client.py
tests/llm/test_embedding_client.py

Omee11 · 2026-05-17T03:28:48Z

Amazing, thank you!

pntech20 · 2026-05-17T04:37:17Z

Thanks! Glad this was useful.

fix: make embedding batch size configurable

b3a485c

coderabbitai Bot reviewed May 17, 2026

pntech20 wants to merge 1 commit into plastic-labs:main from pntech20:codex/configurable-embedding-batch-size
