
Content truncation mismatch: servers discard 60% of indexed content before LLM call #181

@JayDS22

Description


The pipeline uses RecursiveCharacterTextSplitter with a default chunk_size=1000 and stores chunks as VARCHAR(2000):

# pipelines/kubeflow-pipeline.py
'content_text': chunk[:2000],

But both server/app.py and server-https/app.py truncate to 400 characters before passing results to the LLM:

if isinstance(content_text, str) and len(content_text) > 400:
    content_text = content_text[:400] + "..."

With the default chunk size of 1000 chars, the server-side truncation discards roughly 60% of each chunk before the LLM sees it. The 400-char limit is a reasonable trade-off for fitting more results into the context window, but the issue is that the MCP server in kagent-feast-mcp/mcp-server/server.py does not truncate at all. It passes full content_text to the agent.
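Quick arithmetic behind the 60% figure (values taken from the defaults above):

```python
# Fraction of a full-size chunk discarded by the server-side truncation.
chunk_size = 1000    # RecursiveCharacterTextSplitter default chunk_size
server_limit = 400   # truncation limit in server/app.py and server-https/app.py

discarded = 1 - server_limit / chunk_size
print(f"{discarded:.0%} of each full-size chunk is discarded")  # 60%
```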

This means the same query against the same index will produce different answer quality depending on whether it goes through Architecture A (main servers, 400 chars) or Architecture B (Kagent/MCP, full content). That makes it hard to evaluate retrieval quality consistently across the system.

Suggested approach: make the truncation limit configurable via environment variable (e.g., CONTENT_MAX_CHARS) and align the default across all server implementations so evaluation results are comparable regardless of code path.
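A minimal sketch of what that could look like — the helper name, module placement, and `CONTENT_MAX_CHARS` variable are suggestions, not existing code in the repo:

```python
import os

# Shared truncation limit, overridable per deployment via environment variable.
# Default of 400 matches the current behavior of server/app.py.
CONTENT_MAX_CHARS = int(os.environ.get("CONTENT_MAX_CHARS", "400"))

def truncate_content(content_text, limit=CONTENT_MAX_CHARS):
    """Truncate chunk text to a shared limit, mirroring the existing check."""
    if isinstance(content_text, str) and len(content_text) > limit:
        return content_text[:limit] + "..."
    return content_text
```

If all three server implementations (server/app.py, server-https/app.py, and the Kagent MCP server) import the same helper, the truncation behavior stays in lockstep and can be tuned to "no truncation" by setting a large limit where that's desired.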

PR freeze is on, so just flagging this as an issue. Happy to pick it up when PRs open back up.
