Add intfloat/multilingual-e5-large-instruct model support by RimoGuin · Pull Request #622 · qdrant/fastembed

RimoGuin · 2026-04-06T22:18:16Z

All Submissions:

Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New models submission:

Have you added an explanation of why it's important to include this model?

Closes #140. This was previously attempted in #181 but was not completed. intfloat/multilingual-e5-large-instruct is a state-of-the-art multilingual embedding model that supports instruction-based embeddings across 100+ languages. It outperforms multilingual-e5-large on MTEB benchmarks and is widely used for multilingual retrieval tasks. In my own experience, multilingual-e5-large-instruct performs noticeably better than multilingual-e5-large on retrieval tasks, including in the other supported languages.
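For readers unfamiliar with what "instruction-based" means here, the following is a minimal sketch of the query format described on the model's Hugging Face card, as far as I understand it: queries are prefixed with a one-sentence task description, while documents are embedded as plain text. The helper name, task string, and example texts are illustrative. This PR does not state whether fastembed applies any prefix itself, so the sketch assumes the caller does it.

```python
def get_detailed_instruct(task_description: str, query: str) -> str:
    """Prefix a query with its task instruction (format taken from the model card)."""
    return f"Instruct: {task_description}\nQuery: {query}"


# Illustrative task description and texts; replace with your own.
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = [get_detailed_instruct(task, "how much protein should a female eat")]

# Documents are passed as-is, without an instruction prefix.
documents = ["Protein needs vary with age, body weight, and activity level."]

print(queries[0])
```

The formatted strings in `queries` would then be passed to `model.embed(...)` in place of the raw query text.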

Have you added tests for the new model? Were canonical values for tests computed via the original model?

Yes, canonical values were computed using fastembed itself (not sentence-transformers).

Have you added the code snippet for how canonical values were computed?

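import numpy as np
from fastembed import TextEmbedding

# Load the model through fastembed.
model = TextEmbedding(model_name="intfloat/multilingual-e5-large-instruct")
docs = ["hello world", "flag embedding"]
# embed() returns a generator, so materialize it before indexing.
embeddings = list(model.embed(docs))
vec = np.array(embeddings[0])
# Print the leading values of the first embedding for use as canonical test values.
print("First 5 values:", vec[:5].tolist())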

Have you successfully run tests with your changes locally?

Yes, verified via a standalone script that the canonical vector matches within atol=1e-3.
cc @hh-space-invader @joein
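The tolerance check mentioned above can be sketched roughly as follows. This is not the standalone script used for this PR, only an illustration of comparing the leading values of an embedding against a canonical vector with an absolute tolerance of 1e-3. The function name is hypothetical and the numbers are placeholders, not real model output.

```python
def matches_canonical(embedding, canonical, atol=1e-3):
    """Return True if the leading values of `embedding` agree with `canonical` within `atol`."""
    head = list(embedding[: len(canonical)])
    if len(head) != len(canonical):
        return False
    return all(abs(a - b) <= atol for a, b in zip(head, canonical))


# Placeholder numbers for illustration only, not the model's real output.
canonical = [0.01, -0.02, 0.03, 0.04, -0.05]
embedding = [0.0104, -0.0198, 0.0301, 0.0399, -0.0502, 0.5]

print(matches_canonical(embedding, canonical))
```

With NumPy available, `np.allclose(vec[:5], canonical, atol=1e-3)` expresses a similar comparison, though `np.allclose` also applies a default relative tolerance.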

Note: #181 previously attempted to add this model but required a manual ONNX export because, I believe, official ONNX support was unavailable at the time. The ONNX model is now officially available on the model's Hugging Face page, making this a clean addition without any manual export.

coderabbitai · 2026-04-06T22:21:11Z

📝 Walkthrough

This pull request adds support for a new ONNX-based text embedding model, intfloat/multilingual-e5-large-instruct, by extending the supported models registry in the FastEmbed library. The model entry specifies an embedding dimension of 1024, references the Hugging Face model source, and declares the ONNX artifact locations (model file and data file). A corresponding entry was added to the canonical vector values dictionary in the tests so that the new model's output can be validated.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly and clearly describes the main change: adding support for the intfloat/multilingual-e5-large-instruct model.
Linked Issues check	✅ Passed	The PR successfully implements the primary objective from issue `#140` by adding the intfloat/multilingual-e5-large-instruct model to the supported models with proper configuration and tests.
Out of Scope Changes check	✅ Passed	All changes are directly related to adding the new model: model entry in onnx_embedding.py and corresponding test canonical vector in test file.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check	✅ Passed	The pull request description directly addresses the changeset by explaining the addition of the intfloat/multilingual-e5-large-instruct model, providing justification, test details, and verification steps.


Add intfloat/multilingual-e5-large-instruct model support

7e7d76c

RimoGuin wants to merge 1 commit into qdrant:main from RimoGuin:feat/add-multilingual-e5-large-instruct
