Add supervisor cache #12

Open
xabiandrade-db wants to merge 10 commits into databricks-solutions:main from xabiandrade-db:xabi/faq-caching

@xabiandrade-db
Summary

Implements a routing cache for the Supervisor agent, backed by vector search (VS). The cache stores routing decisions (which agent to use) rather than complete responses, because queries like "What is my bill today?" or "What is my account status?" involve user-specific, time-sensitive data that still requires tool calling on every request.

Changes

Cache Manager (telco_support_agent/cache/manager.py)

Implemented a new CacheManager class that handles routing cache operations:

  • Cache lookup with similarity validation: get_cache() performs a vector similarity search and validates the results against a configurable threshold. It returns a cached routing decision only when the similarity score meets or exceeds the threshold.

  • Cache writing: put_cache() and add_to_cache_async() methods write new routing decisions to the cache table. The async variant uses background threads to avoid blocking the routing flow.
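
For reviewers, here is a minimal sketch of the lookup and write behavior described above. It is not the code in manager.py: the class name SketchCacheManager, the toy_embed function, and the in-memory list are all assumptions standing in for the real embedding model and the vector search index. Only the method names get_cache(), put_cache(), and add_to_cache_async() come from this PR.

```python
import math
import threading
from typing import Callable, Optional

Vector = list[float]


def toy_embed(text: str) -> Vector:
    """Letter-frequency vector; a hypothetical stand-in for a real embedding model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SketchCacheManager:
    """Illustrative only: an in-memory list replaces the vector search index."""

    def __init__(self, embed: Callable[[str], Vector], similarity_threshold: float = 0.8):
        self._embed = embed
        self._threshold = similarity_threshold
        self._entries: list[tuple[Vector, str]] = []
        self._lock = threading.Lock()

    def get_cache(self, query: str) -> Optional[str]:
        """Return the cached agent type only if the best match meets the threshold."""
        query_vec = self._embed(query)
        with self._lock:
            entries = list(self._entries)
        best_score, best_agent = -1.0, None
        for vec, agent in entries:
            score = cosine_similarity(query_vec, vec)
            if score > best_score:
                best_score, best_agent = score, agent
        return best_agent if best_score >= self._threshold else None

    def put_cache(self, query: str, agent_type: str) -> None:
        """Synchronously store a routing decision."""
        vec = self._embed(query)
        with self._lock:
            self._entries.append((vec, agent_type))

    def add_to_cache_async(self, query: str, agent_type: str) -> threading.Thread:
        """Store a routing decision on a background thread so routing is not blocked."""
        thread = threading.Thread(
            target=self.put_cache, args=(query, agent_type), daemon=True
        )
        thread.start()
        return thread
```

Returning the thread from add_to_cache_async() is a choice made for this sketch so that tests can join on it; the real method may not return anything.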

Supervisor Agent Integration (telco_support_agent/agents/supervisor.py)

Modified the supervisor's routing logic to leverage the cache:

  • Cache initialization: Supervisor instantiates CacheManager when cache is enabled in configuration (configs/agents/supervisor.yaml)

  • Routing flow with cache: The route_query() method now:

    1. Checks cache first via get_cache(query)
    2. On cache hit: Returns cached agent type immediately (skips LLM call)
    3. On cache miss: Routes via LLM, then stores the decision with add_to_cache_async()
    4. Returns tuple of (agent_type, cache_hit_boolean) for tracking
  • Configurable behavior: Cache can be toggled on/off and similarity threshold tuned via supervisor.yaml configuration
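
The routing flow above can be sketched roughly as follows. This is a simplified free function rather than the actual Supervisor.route_query() method; the llm_route callable, the RoutingCache protocol, and the ExactMatchCache stub are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional, Protocol


class RoutingCache(Protocol):
    """Hypothetical interface mirroring the two cache methods used while routing."""

    def get_cache(self, query: str) -> Optional[str]: ...

    def add_to_cache_async(self, query: str, agent_type: str) -> None: ...


def route_query(
    query: str,
    llm_route: Callable[[str], str],
    cache: Optional[RoutingCache] = None,
) -> tuple[str, bool]:
    """Return (agent_type, cache_hit). The cache is skipped entirely when disabled."""
    if cache is not None:
        cached_agent = cache.get_cache(query)
        if cached_agent is not None:
            # Cache hit: skip the LLM call.
            return cached_agent, True

    # Cache miss (or cache disabled): route via the LLM.
    agent_type = llm_route(query)
    if cache is not None:
        cache.add_to_cache_async(query, agent_type)
    return agent_type, False


class ExactMatchCache:
    """Test stub: exact string match and a synchronous write, unlike the real cache."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get_cache(self, query: str) -> Optional[str]:
        return self._store.get(query)

    def add_to_cache_async(self, query: str, agent_type: str) -> None:
        self._store[query] = agent_type
```

Passing cache=None models the "cache disabled" configuration, in which every query goes to the LLM and the second tuple element is always False.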

Configuration

  • Configurable similarity threshold as part of supervisor configuration (default: 0.8)
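
As an illustration of how these settings might be validated once loaded from supervisor.yaml, here is a small sketch. The class name RoutingCacheConfig, the key names enabled and similarity_threshold, the enabled default of False, and the [0, 1] range check are all assumptions; only the 0.8 default threshold comes from this PR.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class RoutingCacheConfig:
    """Hypothetical config model; field names are assumed, not taken from the PR."""

    enabled: bool = False  # assumed default; the PR does not state it
    similarity_threshold: float = 0.8  # default stated in the PR

    def __post_init__(self) -> None:
        # Assumes similarity scores are normalized to [0, 1].
        if not 0.0 <= self.similarity_threshold <= 1.0:
            raise ValueError(
                f"similarity_threshold must be in [0, 1], got {self.similarity_threshold}"
            )

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "RoutingCacheConfig":
        """Build the config from an already-parsed YAML mapping."""
        return cls(
            enabled=bool(raw.get("enabled", False)),
            similarity_threshold=float(raw.get("similarity_threshold", 0.8)),
        )
```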

Testing

  • Added unit tests for cache config.
  • To validate the routing mechanism itself, I've added a notebook: notebooks/02_run_agent/test_routing_cache.py.
  • The same notebook can be used to benchmark cache performance, which can inform when enabling the cache makes sense for a given LLM choice.
  • Deployed the changes to a test workspace for testing.
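
For anyone who wants to reproduce a rough benchmark outside the notebook, a sketch along these lines may help. It is not the notebook's code: benchmark_routing and its route argument are hypothetical, and route is assumed to return the (agent_type, cache_hit) tuple described under Changes.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def benchmark_routing(
    route: Callable[[str], tuple[str, bool]],
    queries: Sequence[str],
) -> dict[str, float]:
    """Measure cache hit rate and mean wall-clock latency over a list of queries."""
    if not queries:
        raise ValueError("queries must not be empty")

    latencies: list[float] = []
    hits = 0
    for query in queries:
        start = time.perf_counter()
        _agent_type, cache_hit = route(query)
        latencies.append(time.perf_counter() - start)
        hits += int(cache_hit)

    return {
        "hit_rate": hits / len(queries),
        "mean_latency_s": mean(latencies),
    }
```

Running it once with the cache enabled and once with it disabled, on the same query list, gives a simple comparison of the two configurations.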
