Skip to content

feat(py): integrate DAP into generate call, expand wildcards#5075

Open
huangjeff5 wants to merge 3 commits intojh-dap-pr1from
jh-dap-pr2
Open

feat(py): integrate DAP into generate call, expand wildcards#5075
huangjeff5 wants to merge 3 commits intojh-dap-pr1from
jh-dap-pr2

Conversation

@huangjeff5
Copy link
Copy Markdown
Contributor

@huangjeff5 huangjeff5 commented Apr 8, 2026

Summary

  • Child registry per generate: _generate_action creates an effective_registry (child of the root registry) at the start of each generate call. Tool resolution and resource lookup use this child so DAP-resolved tools are scoped to the call and do not pollute the root registry.
  • Wildcard tool expansion: expand_wildcard_tools() expands patterns such as provider:tool/* and provider:tool/prefix* into concrete tool names via DynamicActionProvider.list_action_metadata. Expansion runs before parameter resolution so the model receives a full ToolDefinition list.
  • Tool execution: resolve_tool_requests receives effective_registry so execution uses the child’s resolved action cache.

@huangjeff5 huangjeff5 changed the base branch from main to jh-dap-pr1 April 8, 2026 16:26
@github-actions github-actions bot added the python Python label Apr 8, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a child registry pattern to Genkit, allowing for ephemeral, scoped registries that delegate lookups to a parent. It also adds support for expanding wildcard tool names (e.g., provider:tool/*) via dynamic action providers within the generate loop. I have reviewed the implementation and provided feedback regarding potential performance optimizations for wildcard name checking and registry traversal.

I am having trouble creating individual review comments. Click here to see my feedback.

py/packages/genkit/src/genkit/_ai/_generate.py (73)

medium

The condition ':' not in name or not name.endswith('*') is slightly inefficient because ':' not in name is checked for every iteration, even if the name is a simple tool name. Consider checking for the wildcard suffix first, as it is more specific.

py/packages/genkit/src/genkit/_core/_registry.py (619-622)

medium

The loop for parent_meta in await self._parent.list_actions(allowed_kinds): is called inside list_actions. If the registry hierarchy is deep, this could lead to redundant recursive calls. Consider if caching or a different traversal strategy is needed for performance.

Comment thread py/packages/genkit/tests/genkit/ai/dynamic_tools_generate_test.py Outdated
@huangjeff5 huangjeff5 requested a review from pavelgj April 8, 2026 18:23
@huangjeff5 huangjeff5 marked this pull request as ready for review April 8, 2026 18:23
@huangjeff5 huangjeff5 requested a review from ifielker April 14, 2026 20:24
logger = get_logger(__name__)


async def expand_wildcard_tools(registry: Registry, tool_names: list[str]) -> list[str]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works a bit differently from the javascript implementation and there is a critical bug here regarding how the expanded tool names are returned.

Currently, the function strips the provider prefix and appends only the unqualified tool name:

tool_name = meta.get('name')
if tool_name:
    expanded.append(str(tool_name))

This creates a serious conflict resolution issue because it drops the explicit provider binding. When the generation loop later attempts to resolve this unqualified name (e.g., 'echo'), the registry will fall back to querying all registered DAPs in registration order to find a match.

The Bug Scenario:

  1. You have two MCP servers registered: mcp1 (registered first) and mcp2 (registered second).
  2. Both servers expose a tool named 'echo'.
  3. The user explicitly requests tools from only the second server: tools=['mcp2:tool/*'].
  4. This function expands that request to just ['echo'].
  5. When the framework resolves 'echo', it asks the DAPs in order. mcp1 replies first saying "I have echo!" and its tool is executed instead of mcp2's.

The Fix: To maintain the explicit provider binding (and match how the JS implementation behaves), we should reconstruct the fully-qualified DAP key so the registry knows exactly which provider to query later on:

tool_name = meta.get('name')
if tool_name:
    expanded.append(f'/dynamic-action-provider/{provider_name}:{action_type}/{tool_name}')

This guarantees the executed tool will strictly belong to the provider the user requested.

It would also be a good idea to have a test for this scenario.

registry = Registry()
result = await expand_wildcard_tools(registry, ['my_tool', 'other_tool'])
assert result == ['my_tool', 'other_tool']

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a test that shows the bug I mentioned:

async def test_wildcard_tools_avoids_shadowing_conflict() -> None:
    """Explicit wildcard provider paths should not be shadowed by earlier providers."""
    ai = Genkit()
    pm, _ = define_programmable_model(ai)

    call_log: list[str] = []

    class Inp(BaseModel):
        x: str

    async def echo1_fn(inp: Inp) -> str:
        call_log.append('mcp1')
        return 'echo 1'

    async def echo2_fn(inp: Inp) -> str:
        call_log.append('mcp2')
        return 'echo 2'

    # Detached Actions (not registered in root registry directly)
    echo1_action = Action(name='echo', kind=ActionKind.TOOL, fn=echo1_fn, metadata={'name': 'echo'})
    echo2_action = Action(name='echo', kind=ActionKind.TOOL, fn=echo2_fn, metadata={'name': 'echo'})

    async def dap1_fn() -> DapValue:
        return {'tool': [echo1_action]}

    async def dap2_fn() -> DapValue:
        return {'tool': [echo2_action]}

    # Register mcp1 first. If resolution falls back to an unqualified lookup, mcp1 will "win".
    ai.define_dynamic_action_provider('mcp1', dap1_fn)
    ai.define_dynamic_action_provider('mcp2', dap2_fn)

    # The model calls the 'echo' tool
    pm.responses = [
        _tool_call_response('echo', {'x': 'hello'}),
        _text_response('finished'),
    ]

    response = await ai.generate(
        model='programmableModel',
        prompt='use echo',
        # Crucially, we explicitly request tools from mcp2 ONLY
        tools=['mcp2:tool/*'],
    )

    assert response.text == 'finished'
    
    # If the bug is present, this will fail because it will fall back to the unqualified 
    # global loop and find mcp1's 'echo' tool instead.
    assert call_log == ['mcp2']

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants