Context
As we move toward implementing a more robust, deterministic Agentic RAG architecture (utilizing LangGraph for stateful semantic routing), we need to ensure the underlying folder structure supports complex AI logic without polluting the network transport layers.
Currently, the Docs Agent exposes two separate API boundaries:
The WebSocket server (server/app.py)
The REST API (server-https/app.py)
The Problem
If we place the new state-machine routing logic directly inside either of these app.py files (or loosely inside their respective directories), we will introduce three architectural problems:
Code Duplication: The LLM routing logic, intent classification nodes, and edge definitions would need to be duplicated across both API boundaries.
Fragile Maintenance: Updating a retrieval tool or tweaking a prompt would require modifying files in multiple directories, increasing the risk of desync between the WebSocket and HTTP endpoints.
Poor Testability: Tightly coupling the AI logic to FastAPI/ASGI routes makes it hard to unit-test the LLM behavior in isolation, because every test would have to stand up or mock the web server as well.
Proposed Solution
I propose establishing a Shared Core Architecture by introducing a dedicated core_agent/ module at the project root. This directory will act as the single source of truth for the agent's "brain," cleanly decoupling the semantic routing from the HTTP/WebSocket transport layers.
Proposed Directory Structure:
kubeflow/docs-agent/
│
├── core_agent/
│ ├── __init__.py
│ ├── graph.py <-- Shared LangGraph state machine and routing logic
│ └── tools.py <-- Milvus retrieval tool definitions
│
├── server/
│ └── app.py <-- Imports and invokes core_agent.graph
│
├── server-https/
│ └── app.py <-- Imports and invokes core_agent.graph
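To illustrate the layout above, here is a minimal, dependency-free sketch of how both transports could call one shared entry point. The names run_agent, handle_ws_message, and handle_http_request, as well as the retrieve/generate callables, are hypothetical placeholders and not existing code in the repository; in the real module the body of run_agent would invoke the compiled LangGraph graph.

```python
from typing import Callable, Dict, List

# Hypothetical signatures for the dependencies injected into the core.
Retriever = Callable[[str], List[str]]
Generator = Callable[[str, List[str]], str]


def run_agent(question: str, retrieve: Retriever, generate: Generator) -> Dict[str, object]:
    """Transport-agnostic entry point that would live in core_agent/ (assumed name)."""
    docs = retrieve(question)
    answer = generate(question, docs)
    return {"answer": answer, "sources": docs}


def handle_ws_message(message: str, retrieve: Retriever, generate: Generator) -> str:
    """Roughly what server/app.py would do: unwrap the frame, call the core, return text."""
    result = run_agent(message.strip(), retrieve, generate)
    return str(result["answer"])


def handle_http_request(body: Dict[str, str], retrieve: Retriever, generate: Generator) -> Dict[str, object]:
    """Roughly what server-https/app.py would do: unwrap JSON, call the core, return JSON."""
    return run_agent(body["question"].strip(), retrieve, generate)
```

Both handlers contain only request unwrapping and response shaping, so a change to retrieval or prompting would touch the core function alone.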
Implementation Plan
Create the Core Module: Scaffold the core_agent/ directory with a robust LangGraph state schema (graph.py) to handle intent classification and cyclic error correction.
Decouple the APIs: Update both server/app.py and server-https/app.py to import this shared graph, ensuring they only act as transport layers.
Comprehensive Testing: Introduce a fully mocked pytest suite (tests/test_graph.py) to validate the conditional routing edges and failure recovery loops without any network or LLM API calls.
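The steps above can be sketched as follows. This is a plain-Python stand-in for the LangGraph state machine, written so it runs without any dependencies; the state fields, the intent labels ("docs", "chitchat"), the MAX_ATTEMPTS bound, and the run_graph name are all assumptions made for illustration, and the actual graph.py would express the same nodes and conditional edges through LangGraph.

```python
from typing import Callable, List, TypedDict


class AgentState(TypedDict):
    question: str
    intent: str            # "docs" or "chitchat" in this sketch (assumed labels)
    documents: List[str]
    answer: str
    attempts: int          # number of retrieval attempts made so far


MAX_ATTEMPTS = 2  # assumed bound on the failure recovery loop


def run_graph(
    question: str,
    classify: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> AgentState:
    """Walk the nodes classify -> retrieve (with retry) -> generate."""
    state: AgentState = {
        "question": question, "intent": "", "documents": [], "answer": "", "attempts": 0,
    }
    node = "classify"
    while node != "end":
        if node == "classify":
            state["intent"] = classify(state["question"])
            # Conditional edge: only documentation questions go through retrieval.
            node = "retrieve" if state["intent"] == "docs" else "generate"
        elif node == "retrieve":
            state["attempts"] += 1
            state["documents"] = retrieve(state["question"])
            # Cyclic edge: retry an empty result until the attempt bound is reached.
            if state["documents"] or state["attempts"] >= MAX_ATTEMPTS:
                node = "generate"
        elif node == "generate":
            state["answer"] = generate(state["question"], state["documents"])
            node = "end"
    return state


def test_retry_then_success() -> None:
    """pytest-style test using fakes only: no web server, network, or LLM involved."""
    calls: List[str] = []

    def flaky_retrieve(q: str) -> List[str]:
        calls.append(q)
        return [] if len(calls) == 1 else ["doc"]

    state = run_graph("q", lambda q: "docs", flaky_retrieve, lambda q, d: f"{len(d)} docs")
    assert state["attempts"] == 2
    assert state["answer"] == "1 docs"
```

Because the classifier, retriever, and generator are passed in as callables, the test exercises the routing and retry edges directly and never imports either app.py.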
Relation to Ongoing Work
This structural refactor serves as the foundational Phase 1 architecture for my GSoC proposal and is designed to integrate cleanly alongside the performance optimizations introduced in PR #129.