Add FunASR speech2text plugin by LauraGPT · Pull Request #3281 · langgenius/dify-official-plugins

LauraGPT · 2026-06-12T09:41:09Z

Summary

Add FunASR speech-to-text plugin for Dify.

FunASR is an open-source ASR toolkit from Alibaba DAMO Academy (17.7K+ GitHub stars, Apache 2.0).

Why FunASR?

170x faster than Whisper — Paraformer RTF 0.006, SenseVoice RTF 0.007
50+ languages with emotion detection (SenseVoice)
Chinese production with built-in punctuation (Paraformer)
Speaker diarization via CAM++ (no pyannote needed)
OpenAI-compatible API —
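To put the RTF figures above in context: real-time factor is conventionally defined as processing time divided by audio duration, so lower is faster. The sketch below only does that arithmetic for the two RTF values quoted in this PR. It says nothing about Whisper, whose RTF is not stated here, and the helper names are illustrative.

```python
def processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Time needed to transcribe audio of the given length at a given RTF."""
    return audio_seconds * rtf


def realtime_speedup(rtf: float) -> float:
    """How many times faster than real time a given RTF corresponds to."""
    return 1.0 / rtf


one_hour = 3600.0
# RTF values as quoted in the PR description.
for name, rtf in {"paraformer": 0.006, "sensevoice": 0.007}.items():
    print(
        f"{name}: {processing_seconds(one_hour, rtf):.1f}s per hour of audio, "
        f"{realtime_speedup(rtf):.0f}x real time"
    )
```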

Models included

Model	Languages	Best for
sensevoice	50+	General multilingual, emotion detection
paraformer	zh + mixed	Chinese production with punctuation
paraformer-en	en	English
fun-asr-nano	31	Encoder+LLM architecture

Setup

Deploy FunASR server:

pip install funasr vllm fastapi uvicorn
funasr-server --device cuda --host 0.0.0.0 --port 8000

Configure plugin with server URL (e.g. http://your-server:8000).
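For a quick check of the server outside Dify, a request along the following lines may be enough. This is a sketch under assumptions: it presumes the server follows the OpenAI audio API layout (POST to /v1/audio/transcriptions with multipart fields `file` and `model`, JSON response with a `text` field), which this PR describes only as "OpenAI-compatible". The function names are illustrative and not part of the plugin.

```python
from typing import BinaryIO, Optional


def transcription_url(server_url: str) -> str:
    """Build the transcription endpoint from a server URL, with or without /v1."""
    base = server_url.rstrip("/").removesuffix("/v1")
    return f"{base}/v1/audio/transcriptions"


def transcribe(server_url: str, model: str, audio: BinaryIO,
               api_key: Optional[str] = None) -> str:
    """POST an audio file and return the transcribed text (assumed response shape)."""
    import httpx  # imported lazily so the URL helper works without httpx installed

    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    response = httpx.post(
        transcription_url(server_url),
        data={"model": model},
        files={"file": ("audio.wav", audio, "audio/wav")},
        headers=headers,
        timeout=60.0,
    )
    response.raise_for_status()
    return response.json()["text"]
```

Usage would look like `transcribe("http://your-server:8000", "sensevoice", open("sample.wav", "rb"))`, where `sample.wav` is a placeholder file name.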

Implementation

Extends OAICompatSpeech2TextModel (FunASR exposes OpenAI-compatible API)
Predefined models + customizable model support
Minimal dependencies: dify_plugin, httpx, openai

Links:

GitHub: https://github.com/modelscope/FunASR
PyPI: https://pypi.org/project/funasr/
Docs: https://modelscope.github.io/FunASR/

Test plan

Deploy FunASR server locally
Configure plugin in Dify with server URL
Test speech2text with all 4 predefined models
Test customizable model with custom model name
Verify error handling for unreachable server

- Open-source ASR from Alibaba DAMO Academy (17K+ stars)
- 170x faster than Whisper (RTF 0.006-0.007)
- 4 models: sensevoice (50+ langs), paraformer (zh), paraformer-en, fun-asr-nano
- OpenAI-compatible API endpoint
- Speaker diarization support (CAM++)

gemini-code-assist

Code Review

This pull request introduces the FunASR Speech Recognition Plugin, enabling open-source speech-to-text capabilities via an OpenAI-compatible API. The feedback focuses on improving robustness and compatibility: implementing direct reachability checks in validate_credentials and validate_provider_credentials to avoid failures on servers that do not support /v1/models; handling missing endpoint_url and empty api_key values properly in _compat_credentials; forwarding the user parameter in _invoke; and populating model_properties with default file upload limits and supported extensions in get_customizable_model_schema.


gemini-code-assist · 2026-06-12T09:43:26Z

+    def validate_credentials(self, model: str, credentials: dict) -> None:
+        compat = self._compat_credentials(credentials)
+        super().validate_credentials(model, compat)


The default OAICompatSpeech2TextModel.validate_credentials typically calls the /v1/models endpoint to verify credentials. However, FunASR servers (and many lightweight OpenAI-compatible ASR servers) often do not implement /v1/models, which would cause validation to fail with a 404 or 405 error and prevent users from adding the model. Since httpx is already included in the dependencies, we can perform a direct reachability check on the endpoint instead.

    def validate_credentials(self, model: str, credentials: dict) -> None:
        compat = self._compat_credentials(credentials)
        import httpx
        try:
            # FunASR server may not support /v1/models, so we verify reachability of the endpoint directly.
            httpx.get(compat["endpoint_url"], timeout=5.0)
        except httpx.RequestError as e:
            raise ValueError(f"Failed to connect to FunASR server at {compat['endpoint_url']}: {e}")
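One detail worth spelling out is the difference between "the server answered with an error status" and "nothing answered at all". The sketch below uses the standard library's urllib instead of httpx (a deliberate swap so it runs without third-party packages). The function name `check_reachable` is hypothetical. Unlike `httpx.get`, `urllib.request.urlopen` raises on 4xx/5xx responses, so those are caught first and treated as proof that a server is listening.

```python
import urllib.error
import urllib.request


def check_reachable(endpoint_url: str, timeout: float = 5.0) -> None:
    """Raise ValueError if no HTTP server answers at endpoint_url."""
    try:
        with urllib.request.urlopen(endpoint_url, timeout=timeout):
            pass  # any successful response means the server is up
    except urllib.error.HTTPError:
        # The server replied with a 4xx/5xx status, so it is reachable.
        # HTTPError subclasses URLError, so this clause must come first.
        return
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar transport errors.
        raise ValueError(
            f"Failed to connect to FunASR server at {endpoint_url}: {exc}"
        ) from exc
```

A check like this only confirms that something speaks HTTP at the address. It does not confirm that the transcription route exists or that the API key is accepted.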

gemini-code-assist · 2026-06-12T09:43:26Z

+    @classmethod
+    def _compat_credentials(cls, credentials: dict) -> dict:
+        credentials = credentials.copy()
+        base = credentials["endpoint_url"].rstrip("/").removesuffix("/v1")
+        credentials["endpoint_url"] = f"{base}/v1"
+        credentials.setdefault("api_key", "no-key")
+        return credentials


There are two potential issues in _compat_credentials:

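If endpoint_url is missing, credentials["endpoint_url"] raises a KeyError, and if it is None, the .rstrip call raises an AttributeError. An empty string raises nothing: it is silently normalized to "/v1", which is not a usable URL.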

If the user leaves the optional API Key field blank in the UI, credentials will contain "api_key": "" (an empty string). Using setdefault will not replace this empty string with "no-key" because the key already exists in the dictionary. This can cause downstream authentication failures. We should explicitly check for falsy/empty values.

Suggested change

-    @classmethod
-    def _compat_credentials(cls, credentials: dict) -> dict:
-        credentials = credentials.copy()
-        base = credentials["endpoint_url"].rstrip("/").removesuffix("/v1")
-        credentials["endpoint_url"] = f"{base}/v1"
-        credentials.setdefault("api_key", "no-key")
-        return credentials
+    @classmethod
+    def _compat_credentials(cls, credentials: dict) -> dict:
+        credentials = credentials.copy()
+        endpoint_url = credentials.get("endpoint_url")
+        if not endpoint_url:
+            raise ValueError("endpoint_url is required")
+        base = endpoint_url.rstrip("/").removesuffix("/v1")
+        credentials["endpoint_url"] = f"{base}/v1"
+        if not credentials.get("api_key"):
+            credentials["api_key"] = "no-key"
+        return credentials
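The normalization rules discussed in this comment can be exercised in isolation. Below is a minimal standalone sketch, written as a plain function rather than the plugin's classmethod. The name `compat_credentials` and the extra whitespace stripping are illustrative additions, not part of the PR.

```python
def compat_credentials(credentials: dict) -> dict:
    """Return a copy of credentials normalized for an OpenAI-style client."""
    normalized = dict(credentials)  # never mutate the caller's dict

    endpoint_url = (normalized.get("endpoint_url") or "").strip()
    if not endpoint_url:
        raise ValueError("endpoint_url is required")

    # Accept both "http://host:8000" and "http://host:8000/v1/".
    base = endpoint_url.rstrip("/").removesuffix("/v1")
    normalized["endpoint_url"] = f"{base}/v1"

    # A blank form field arrives as "", which setdefault would leave untouched.
    if not normalized.get("api_key"):
        normalized["api_key"] = "no-key"
    return normalized


print(compat_credentials({"endpoint_url": "http://your-server:8000/", "api_key": ""}))
```

Note that `str.removesuffix` requires Python 3.9 or later.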

gemini-code-assist · 2026-06-12T09:43:26Z

+    def _invoke(self, model: str, credentials: dict, file: IO[bytes], user: Optional[str] = None) -> str:
+        compat = self._compat_credentials(credentials)
+        return super()._invoke(model, compat, file)


The user parameter is received in _invoke but is not forwarded to the superclass method call. Forwarding user ensures that any user-specific tracking, logging, or rate-limiting in the base class functions correctly.

Suggested change

-    def _invoke(self, model: str, credentials: dict, file: IO[bytes], user: Optional[str] = None) -> str:
-        compat = self._compat_credentials(credentials)
-        return super()._invoke(model, compat, file)
+    def _invoke(self, model: str, credentials: dict, file: IO[bytes], user: Optional[str] = None) -> str:
+        compat = self._compat_credentials(credentials)
+        return super()._invoke(model, compat, file, user)
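The effect of the missing argument is easy to demonstrate with a toy hierarchy. In the sketch below, `BaseSpeech2Text` is a hypothetical stand-in that merely echoes what it receives. It is not the real OAICompatSpeech2TextModel, whose handling of `user` is not shown in this thread.

```python
import io
from typing import IO, Optional


class BaseSpeech2Text:
    """Hypothetical stand-in for the real base class; it only echoes its arguments."""

    def _invoke(self, model: str, credentials: dict, file: IO[bytes],
                user: Optional[str] = None) -> str:
        return f"model={model} user={user}"


class DropsUser(BaseSpeech2Text):
    def _invoke(self, model, credentials, file, user=None) -> str:
        # `user` is accepted but never passed on, so the base class sees None.
        return super()._invoke(model, credentials, file)


class ForwardsUser(BaseSpeech2Text):
    def _invoke(self, model, credentials, file, user=None) -> str:
        # Passing `user` through keeps any base-class tracking intact.
        return super()._invoke(model, credentials, file, user)


audio = io.BytesIO(b"")
print(DropsUser()._invoke("sensevoice", {}, audio, user="alice"))
print(ForwardsUser()._invoke("sensevoice", {}, audio, user="alice"))
```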

gemini-code-assist · 2026-06-12T09:43:27Z

+    def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
+        return AIModelEntity(
+            model=model,
+            label=I18nObject(en_us=model, zh_hans=model),
+            fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
+            model_type=ModelType.SPEECH2TEXT,
+            model_properties={},
+            parameter_rules=[],
+        )


When adding a customizable model, leaving model_properties empty ({}) means the custom model won't have any defined file upload limits or supported file extensions. This can cause errors or unexpected behavior in Dify when users attempt to upload audio files for transcription. We should populate model_properties with the same default values used by the predefined models.

    def get_customizable_model_schema(self, model: str, credentials: dict) -> Optional[AIModelEntity]:
        return AIModelEntity(
            model=model,
            label=I18nObject(en_us=model, zh_hans=model),
            fetch_from=FetchFrom.CUSTOMIZABLE_MODEL,
            model_type=ModelType.SPEECH2TEXT,
            model_properties={
                "file_upload_limit": 1,
                "supported_file_extensions": [
                    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
                ],
            },
            parameter_rules=[],
        )
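To illustrate why an empty model_properties is a problem, here is a sketch of the kind of check a caller could run against those properties. It uses a plain dict with the values from the suggestion above. The helper `is_supported_audio` is hypothetical, and how Dify itself consumes these properties is not shown in this thread.

```python
from pathlib import Path

# Values copied from the suggestion above; plain dict used for illustration.
DEFAULT_MODEL_PROPERTIES = {
    "file_upload_limit": 1,
    "supported_file_extensions": [
        "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
    ],
}


def is_supported_audio(filename: str, model_properties: dict) -> bool:
    """Check a file name's extension against the model's declared extensions."""
    extensions = model_properties.get("supported_file_extensions", [])
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in extensions


print(is_supported_audio("meeting.WAV", DEFAULT_MODEL_PROPERTIES))
print(is_supported_audio("meeting.WAV", {}))  # nothing declared, nothing accepted
```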

gemini-code-assist · 2026-06-12T09:43:27Z

+    def validate_provider_credentials(self, credentials: dict) -> None:
+        pass


Leaving validate_provider_credentials as a pass means that any invalid or unreachable URL configured at the provider level will be accepted without immediate feedback. Implementing a quick reachability check using httpx ensures that configuration errors are caught early.

Suggested change

-    def validate_provider_credentials(self, credentials: dict) -> None:
-        pass
+    def validate_provider_credentials(self, credentials: dict) -> None:
+        endpoint_url = credentials.get("endpoint_url")
+        if not endpoint_url:
+            raise ValueError("Server URL (endpoint_url) is required")
+        import httpx
+        try:
+            httpx.get(endpoint_url, timeout=5.0)
+        except httpx.RequestError as e:
+            raise ValueError(f"Failed to connect to FunASR server at {endpoint_url}: {e}")

Add FunASR speech2text plugin

11c89d4


dosubot Bot added the size:M (this PR changes 30-99 lines, ignoring generated files) and enhancement (new feature or request) labels on Jun 12, 2026

LauraGPT had a problem deploying to models/funasr June 12, 2026 09:41 — with GitHub Actions Failure

gemini-code-assist Bot reviewed Jun 12, 2026

LauraGPT wants to merge 1 commit into langgenius:main from LauraGPT:add-funasr-speech2text-plugin
