feat: Add EmpirioLabs model provider plugin #3284

Open
Adam-Dalloul wants to merge 2 commits into langgenius:main from Adam-Dalloul:feat/empiriolabs-model-provider

Conversation

@Adam-Dalloul

Summary

Adds the EmpirioLabs model provider plugin under models/empiriolabs/.

EmpirioLabs (EmpirioLabs AI) is an OpenAI-compatible API that serves frontier chat and embedding models through one endpoint, https://api.empiriolabs.ai/v1, with a public GET /v1/models catalog. This plugin mirrors the CometAPI provider plugin and extends the OpenAI-compatible base model classes (OAICompatLargeLanguageModel and the OpenAI-compatible text embedding model), so configuration, streaming, tool calling, and reasoning work the way Dify users already expect from an OpenAI-compatible provider.

It ships 7 predefined chat models plus 1 embedding model, and also supports the customizable-model option so any other slug from the EmpirioLabs catalog can be added by name:

  • LLM: qwen3-7-plus, qwen3-7-max, deepseek-v4-pro, deepseek-v4-flash, glm-5-1, kimi-k2-7-code, minimax-m3
  • Text embedding: text-embedding-v4

Provider links: website https://empiriolabs.ai, docs https://docs.empiriolabs.ai, API keys https://platform.empiriolabs.ai/dashboard/api-keys
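As a rough illustration of the catalog endpoint described above, the sketch below builds (but does not send) a GET request for /v1/models using only the standard library. The helper name build_models_request is hypothetical, and the Bearer authorization header is an assumption based on common OpenAI-compatible conventions, not something this PR confirms.

```python
from typing import Optional
from urllib.request import Request

BASE_URL = "https://api.empiriolabs.ai/v1"  # endpoint named in the summary


def build_models_request(api_key: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for the model catalog."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: Bearer auth, as is conventional for OpenAI-compatible APIs.
        headers["Authorization"] = f"Bearer {api_key}"
    return Request(f"{BASE_URL}/models", headers=headers, method="GET")


req = build_models_request()
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call; the response format is not shown here because the PR does not describe it.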

Change Type

  • Documentation / non-plugin change
  • Non-LLM plugin (tools, extensions, datasource, etc.)
  • LLM plugin

LLM Plugin Checklist

Areas affected by this change (check all that apply)
  • Message flow (system messages, user to assistant turn-taking)
  • Tool interaction flow (multi-round usage, Agent App and Agent Node)
  • Multimodal input (images, PDFs, audio, video, etc.)
  • Structured output (JSON, XML, etc.)
  • Token consumption metrics
  • Other LLM functionality (reasoning, grounding, prompt caching, etc.)
  • New models / model parameter fixes

Version

  • Top-level version in manifest.yaml is 0.0.1 (new plugin)
  • dify_plugin is declared in pyproject.toml and locked in uv.lock

Testing

  • Validated locally: every model YAML parses against dify_plugin.AIModelEntity, the provider config parses against ProviderEntity, all Python modules compile, and the predefined globs, model sources, position file, and icon assets all resolve.

EmpirioLabs is an OpenAI-compatible API that serves frontier chat and
embedding models through one endpoint (https://api.empiriolabs.ai/v1)
and a public GET /v1/models catalog. This plugin mirrors the CometAPI
provider plugin and extends the OpenAI-compatible base model classes.

Ships 7 chat models (qwen3-7-plus, qwen3-7-max, deepseek-v4-pro,
deepseek-v4-flash, glm-5-1, kimi-k2-7-code, minimax-m3) plus the
text-embedding-v4 embedding model. Supports predefined and
customizable model configuration, streaming, tool calling, and
reasoning where the model supports it.

Co-Authored-By: Claude <noreply@anthropic.com>
@dosubot added the size:XL (This PR changes 500-999 lines, ignoring generated files.) and enhancement (New feature or request) labels on Jun 13, 2026
@Adam-Dalloul Adam-Dalloul temporarily deployed to models/empiriolabs June 13, 2026 03:07 — with GitHub Actions Inactive

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces the EmpirioLabs model provider plugin for Dify, adding support for various frontier LLMs and text embedding models. The review feedback highlights several critical improvement opportunities: copying model_parameters before mutation to prevent side effects, aligning the thinking_budget parameter name with the configuration, decoding token slices back to strings before invoking the embedding API to ensure compatibility, renaming a copy-pasted class name, and adding defensive checks to prevent potential AttributeError and KeyError exceptions when handling responses and credentials.

Comment on lines +77 to +82
embeddings_batch, embedding_used_tokens = self._embedding_invoke(
model=model,
client=client,
texts=tokens[i : i + max_chunks],
extra_model_kwargs=extra_model_kwargs,
)

high

Passing raw token lists (list[list[int]]) directly to the embedding API violates the Union[list[str], str] type hint and may cause 400 Bad Request errors on OpenAI-compatible endpoints that do not support token ID inputs. Decoding the token slices back to strings before invoking the API ensures maximum compatibility and type safety.

Suggested change
- embeddings_batch, embedding_used_tokens = self._embedding_invoke(
-     model=model,
-     client=client,
-     texts=tokens[i : i + max_chunks],
-     extra_model_kwargs=extra_model_kwargs,
- )
+ chunk_texts = [enc.decode(t) for t in tokens[i : i + max_chunks]]
+ embeddings_batch, embedding_used_tokens = self._embedding_invoke(
+     model=model,
+     client=client,
+     texts=chunk_texts,
+     extra_model_kwargs=extra_model_kwargs,
+ )
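The decode-before-invoke idea can be tried in isolation. The following is a minimal sketch, assuming a tokenizer object with encode and decode methods; ToyEncoding (one token ID per character) and batch_as_strings are hypothetical stand-ins, not code from the plugin.

```python
from typing import Iterator


class ToyEncoding:
    """Hypothetical stand-in for a tokenizer: one token ID per character."""

    def encode(self, text: str) -> list[int]:
        return [ord(ch) for ch in text]

    def decode(self, tokens: list[int]) -> str:
        return "".join(chr(t) for t in tokens)


def batch_as_strings(
    enc: ToyEncoding, token_chunks: list[list[int]], max_chunks: int
) -> Iterator[list[str]]:
    """Yield batches of decoded strings, at most max_chunks per batch."""
    for i in range(0, len(token_chunks), max_chunks):
        # Decode each token slice so the API receives list[str], not list[list[int]].
        yield [enc.decode(t) for t in token_chunks[i : i + max_chunks]]


enc = ToyEncoding()
tokens = [enc.encode(s) for s in ["alpha", "beta", "gamma"]]
batches = list(batch_as_strings(enc, tokens, max_chunks=2))
print(batches)
```

With a real tokenizer, decoding a truncated token slice may not reproduce the original text exactly at the cut point, so this trades exact token control for broader endpoint compatibility.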

stream: bool = True,
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
self._update_credential(model, credentials)

high

The model_parameters dictionary is mutated inside _invoke by popping elements. Because Python passes the caller's dictionary object rather than a copy, those pops are visible to the caller and can cause unexpected side effects if the same parameters are reused (e.g., during retries or multi-agent execution). Making a shallow copy first prevents this.

Suggested change
- self._update_credential(model, credentials)
+ self._update_credential(model, credentials)
+ model_parameters = model_parameters.copy()
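A small sketch of the side effect, using two hypothetical functions (invoke_mutating and invoke_copying) that stand in for _invoke with and without the copy:

```python
def invoke_mutating(model_parameters: dict) -> dict:
    """Pops directly from the caller's dictionary."""
    model_parameters.pop("reasoning_budget", None)
    return model_parameters


def invoke_copying(model_parameters: dict) -> dict:
    """Works on a shallow copy, leaving the caller's dictionary intact."""
    model_parameters = model_parameters.copy()
    model_parameters.pop("reasoning_budget", None)
    return model_parameters


caller_params = {"temperature": 0.2, "reasoning_budget": 1024}

invoke_copying(caller_params)
kept_after_copying = "reasoning_budget" in caller_params  # key survives

invoke_mutating(caller_params)
kept_after_mutating = "reasoning_budget" in caller_params  # key is gone

print(kept_after_copying, kept_after_mutating)
```

Note that dict.copy() is shallow: it protects against pops and top-level reassignment, but nested values would still be shared with the caller.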

self._update_credential(model, credentials)
# reasoning
reasoning_params = {}
reasoning_budget = model_parameters.pop('reasoning_budget', None)

high

The parameter name defined in glm-5-1.yaml is thinking_budget, but llm.py pops reasoning_budget. This mismatch means thinking_budget will not be popped and will be sent directly to the API as an unknown top-level parameter, potentially causing a 400 Bad Request error. Popping both ensures compatibility.

Suggested change
- reasoning_budget = model_parameters.pop('reasoning_budget', None)
+ reasoning_budget = model_parameters.pop('thinking_budget', None)
+ if reasoning_budget is None:
+     reasoning_budget = model_parameters.pop('reasoning_budget', None)
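One point worth noting about the suggestion: when thinking_budget is present, reasoning_budget is never popped and would still be forwarded. A possible variant, sketched here with a hypothetical helper pop_reasoning_budget, removes both keys unconditionally and prefers thinking_budget. The preference order is an assumption, not something the review specifies.

```python
from typing import Any, Optional


def pop_reasoning_budget(model_parameters: dict[str, Any]) -> Optional[int]:
    """Remove both budget keys and return the one that should be used."""
    # Pop both so neither name leaks through as an unknown top-level parameter.
    thinking = model_parameters.pop("thinking_budget", None)
    reasoning = model_parameters.pop("reasoning_budget", None)
    # Assumption: the YAML-defined name (thinking_budget) wins when both are set.
    return thinking if thinking is not None else reasoning


params = {"thinking_budget": 2048, "reasoning_budget": 512, "temperature": 0.7}
budget = pop_reasoning_budget(params)
print(budget, params)
```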

from ..common_openai import _CommonOpenAI


class OpenAITextEmbeddingModel(_CommonOpenAI, TextEmbeddingModel):

medium

The class name OpenAITextEmbeddingModel appears to be a copy-paste leftover from the OpenAI plugin. It should be renamed to EmpirioLabsTextEmbeddingModel to maintain consistency and avoid potential confusion or naming conflicts.

Suggested change
- class OpenAITextEmbeddingModel(_CommonOpenAI, TextEmbeddingModel):
+ class EmpirioLabsTextEmbeddingModel(_CommonOpenAI, TextEmbeddingModel):

Comment on lines +115 to +120
usage=self._calc_response_usage(
model=model,
credentials=credentials,
prompt_tokens=resp.usage.prompt_tokens,
completion_tokens=resp.usage.completion_tokens,
),

medium

If resp.usage is None (e.g., due to an empty response or API error), accessing resp.usage.prompt_tokens will raise an AttributeError. Adding a defensive check prevents potential crashes.

Suggested change
- usage=self._calc_response_usage(
-     model=model,
-     credentials=credentials,
-     prompt_tokens=resp.usage.prompt_tokens,
-     completion_tokens=resp.usage.completion_tokens,
- ),
+ usage=self._calc_response_usage(
+     model=model,
+     credentials=credentials,
+     prompt_tokens=resp.usage.prompt_tokens if resp.usage else 0,
+     completion_tokens=resp.usage.completion_tokens if resp.usage else 0,
+ ),
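The same guard can be factored into a small helper so the check is written once. This is a sketch only: token_counts is a hypothetical function, and SimpleNamespace objects stand in for the real response type.

```python
from types import SimpleNamespace


def token_counts(resp) -> tuple[int, int]:
    """Return (prompt_tokens, completion_tokens), or zeros if usage is absent."""
    usage = getattr(resp, "usage", None)
    if usage is None:
        return 0, 0
    return usage.prompt_tokens, usage.completion_tokens


# Stand-ins for a response with and without usage data.
with_usage = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=34)
)
without_usage = SimpleNamespace(usage=None)

print(token_counts(with_usage), token_counts(without_usage))
```

Reporting zero tokens avoids the AttributeError but also hides missing usage data, so logging the case may be worth considering.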

Comment on lines +17 to +18
try:
model_instance = self.get_model_instance(ModelType.LLM)

medium

If credentials is empty or missing the api_key, a KeyError will be raised during validation, resulting in an unhandled traceback and a 500 error. Adding a defensive check and raising CredentialsValidateFailedError provides a clean, user-friendly error message.

Suggested change
- try:
-     model_instance = self.get_model_instance(ModelType.LLM)
+ if not credentials or not credentials.get("api_key"):
+     raise CredentialsValidateFailedError("API key is required")
+ try:
+     model_instance = self.get_model_instance(ModelType.LLM)
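To show how the early check behaves for empty, missing, and valid credentials, here is a self-contained sketch. The exception class below is a local stand-in defined only for this example (the plugin would use the real CredentialsValidateFailedError), and require_api_key is a hypothetical helper.

```python
from typing import Optional


class CredentialsValidateFailedError(Exception):
    """Local stand-in for the plugin SDK's error of the same name."""


def require_api_key(credentials: Optional[dict]) -> str:
    """Return the API key, or raise a clean validation error if it is absent."""
    # Covers None, an empty dict, a missing key, and an empty-string key.
    if not credentials or not credentials.get("api_key"):
        raise CredentialsValidateFailedError("API key is required")
    return credentials["api_key"]


def validation_message(credentials: Optional[dict]) -> str:
    """Small driver: report the outcome as a string instead of raising."""
    try:
        require_api_key(credentials)
        return "ok"
    except CredentialsValidateFailedError as exc:
        return str(exc)


print(validation_message({"api_key": "sk-test"}), validation_message({}))
```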

Comment on lines +17 to +21
credentials_kwargs = {
"api_key": credentials['api_key'],
"timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0),
"max_retries": 1,
}

medium

Accessing credentials['api_key'] directly can raise a KeyError if the key is missing. Using .get() is safer and adheres to defensive programming best practices.

Suggested change
- credentials_kwargs = {
-     "api_key": credentials['api_key'],
-     "timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0),
-     "max_retries": 1,
- }
+ credentials_kwargs = {
+     "api_key": credentials.get('api_key', ''),
+     "timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0),
+     "max_retries": 1,
+ }

@dosubot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) and removed the size:XL label (This PR changes 500-999 lines, ignoring generated files.) on Jun 14, 2026
@Adam-Dalloul Adam-Dalloul deployed to models/empiriolabs June 14, 2026 00:47 — with GitHub Actions Active

Labels

enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.
