fix: sanitize MCP tool names for LLM function-calling compatibility #93
Open
SBALAVIGNESH123 wants to merge 1 commit into
Fixes cloudshipai#42.

LLMs like Gemini and OpenAI enforce [a-zA-Z0-9_]+ for tool names and silently normalize hyphens to underscores. When the LLM calls back with the normalized name, GenKit fails to resolve it against the original MCP-provided name.

- Add SanitizeToolName() for centralized name sanitization
- Add SanitizedMCPTool decorator implementing full ai.Tool interface
- Cover both createServerClient and connectToMCPServer code paths
- Fix assignment filter to match both original and sanitized names
- Add comprehensive test suite (13 test cases)
Hey, I ran into the issue described in #42 where MCP tools with hyphens in their names crash during GenKit execution. I dug into it and the root cause is pretty straightforward: LLMs like Gemini and OpenAI enforce a strict pattern for function names ([a-zA-Z0-9_]+, so only alphanumerics and underscores), and when they see a tool like __resolve-library-id they silently call it back as __resolve_library_id. GenKit then can't find that name in its registry and throws a fatal error.
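To make the normalization concrete, here is a minimal sketch of what a function like SanitizeToolName could look like. This is not the code from the diff: it assumes every character outside [a-zA-Z0-9_] is replaced with an underscore, which matches the hyphen case described above but may differ from the actual implementation in other edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// SanitizeToolName is an illustrative sketch, not the PR's actual code.
// Assumption: any rune outside [a-zA-Z0-9_] is replaced with '_'.
func SanitizeToolName(name string) string {
	var b strings.Builder
	b.Grow(len(name))
	for _, r := range name {
		switch {
		case r >= 'a' && r <= 'z',
			r >= 'A' && r <= 'Z',
			r >= '0' && r <= '9',
			r == '_':
			b.WriteRune(r) // already allowed, keep as-is
		default:
			b.WriteByte('_') // hyphens, dots, spaces, etc.
		}
	}
	return b.String()
}

func main() {
	fmt.Println(SanitizeToolName("__resolve-library-id")) // prints "__resolve_library_id"
}
```

Because allowed characters pass through untouched, sanitizing an already-sanitized name returns it unchanged, so the result stays stable if the function runs more than once on the same tool.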
The fix introduces a SanitizedMCPTool wrapper that sits at the MCP discovery boundary and normalizes tool names before they ever reach GenKit. It implements the full ai.Tool interface (all 6 methods) and delegates execution to the original tool unchanged, so the actual tool behavior is completely unaffected. I also added a centralized SanitizeToolName function since I noticed the same normalization logic was copy-pasted across several files.
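The decorator idea can be sketched roughly as follows. The Tool interface here is a hypothetical two-method stand-in, not GenKit's real ai.Tool (which the wrapper in this PR implements in full), and WrapTool, sanitizedTool, and echoTool are names made up for the example. The sketch also only replaces hyphens, to keep it short.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Tool is a hypothetical, narrowed stand-in for a tool interface.
// It is NOT the real ai.Tool; it only carries what this sketch needs.
type Tool interface {
	Name() string
	Run(ctx context.Context, input any) (any, error)
}

// sanitizedTool decorates a Tool: it reports a sanitized name and
// forwards execution to the wrapped tool unchanged.
type sanitizedTool struct {
	inner Tool
	name  string
}

// WrapTool returns nil for a nil tool so callers can wrap blindly.
func WrapTool(t Tool) Tool {
	if t == nil {
		return nil
	}
	// Assumption: hyphen replacement only, for brevity.
	return &sanitizedTool{inner: t, name: strings.ReplaceAll(t.Name(), "-", "_")}
}

func (s *sanitizedTool) Name() string { return s.name }

func (s *sanitizedTool) Run(ctx context.Context, input any) (any, error) {
	return s.inner.Run(ctx, input) // pure delegation, behavior unchanged
}

// echoTool is a fake tool used only for this demo.
type echoTool struct{ name string }

func (e echoTool) Name() string { return e.name }

func (e echoTool) Run(_ context.Context, input any) (any, error) {
	return input, nil
}

func main() {
	wrapped := WrapTool(echoTool{name: "__resolve-library-id"})
	out, err := wrapped.Run(context.Background(), "react")
	fmt.Println(wrapped.Name(), out, err)
}
```

The sanitized name is computed once at wrap time, so the model and the registry always see the same string for a given tool.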
One thing I caught during testing: there was a second failure mode where the tool assignment filter in the execution engine compares DB-stored names against runtime tool names. Since the DB might store the original hyphenated name but the runtime now reports the sanitized name, tools would silently get filtered out. I fixed that by registering both the original and sanitized versions in the lookup map.
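The dual registration can be sketched like this. The function names (buildAssignedSet, filterAssigned, sanitize) are hypothetical and the real filter in the execution engine likely carries more context; the sketch only shows why storing both spellings lets a hyphenated DB entry match a sanitized runtime name.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitize is a simplified stand-in for SanitizeToolName.
// Assumption: it only handles hyphens.
func sanitize(name string) string {
	return strings.ReplaceAll(name, "-", "_")
}

// buildAssignedSet registers every DB-stored name under both its
// original and its sanitized spelling.
func buildAssignedSet(dbNames []string) map[string]struct{} {
	set := make(map[string]struct{}, 2*len(dbNames))
	for _, n := range dbNames {
		set[n] = struct{}{}
		set[sanitize(n)] = struct{}{}
	}
	return set
}

// filterAssigned keeps only the runtime tools present in the set.
func filterAssigned(runtimeNames []string, assigned map[string]struct{}) []string {
	var kept []string
	for _, n := range runtimeNames {
		if _, ok := assigned[n]; ok {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	// The DB still holds the hyphenated name; the runtime reports the sanitized one.
	assigned := buildAssignedSet([]string{"__resolve-library-id"})
	runtime := []string{"__resolve_library_id", "__unassigned_tool"}
	fmt.Println(filterAssigned(runtime, assigned)) // prints "[__resolve_library_id]"
}
```

With only the original name in the set, the first runtime tool would have been dropped even though it is assigned.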
Both code paths are covered (the pooled createServerClient path and the legacy connectToMCPServer fallback). I wrote 13 test cases covering name sanitization, full interface delegation, name consistency across restarts, nil safety, and the assignment filter matching scenario. All existing tests still pass.