Summary
The `kagent-feast-mcp` ingestion pipeline uses a runtime monkey-patch to work around Feast's hardcoded `max_length=512` VARCHAR limit in its Milvus online store integration. This approach modifies installed library source code on disk during pipeline execution, which is fragile and opaque.
Location
`kagent-feast-mcp/pipelines/kubeflow-pipeline.py` in the `store_via_feast` component:
```python
# Patch Feast VARCHAR limit (hardcoded 512 -> 4096) and reload module
import importlib
import inspect
import feast.infra.online_stores.milvus_online_store.milvus as milvus_mod

src_file = inspect.getfile(milvus_mod)
with open(src_file, "r") as f:
    content = f.read()
if "max_length=512" in content:
    with open(src_file, "w") as f:
        f.write(content.replace("max_length=512", "max_length=4096"))
    importlib.reload(milvus_mod)
```
Problem
- Fragile across Feast versions: If Feast renames `max_length` or changes the file structure, the string replacement fails silently and content is still truncated at 512 chars.
- Opaque in debugging: Modifying installed library source at runtime means `pip show feast` still reports the unpatched version. Anyone debugging the pipeline won't see the patch unless they read this specific component's source.
- Unnecessary for the use case: `kagent-feast-mcp/mcp-server/server.py` already uses `pymilvus` directly (via `MilvusClient`) to query the same Milvus instance without needing Feast at all. This is simpler, thread-safe, and avoids the VARCHAR limitation entirely.
- Drop-and-recreate pattern: The `store_via_feast` component also drops the entire Milvus collection before reinserting (`utility.drop_collection`). If the pipeline fails mid-ingestion (e.g., a GitHub API rate limit), the collection is left empty and the agent returns nothing. This compounds the Feast layer's fragility.
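Because the patch exists only on disk inside the installed package, the quickest way to tell which variant a running pipeline actually loaded is to inspect the module source for the marker string. A minimal debugging sketch (not part of the pipeline; the classification logic mirrors the string match the patch itself relies on):

```python
def varchar_patch_state(source_text: str) -> str:
    """Classify the Milvus online store module source by its VARCHAR limit."""
    if "max_length=4096" in source_text:
        return "patched"
    if "max_length=512" in source_text:
        return "unpatched"
    return "unknown"  # upstream changed the string -- the patch would silently no-op

# In a live environment (requires feast installed):
# import inspect
# import feast.infra.online_stores.milvus_online_store.milvus as milvus_mod
# print(varchar_patch_state(inspect.getsource(milvus_mod)))
```

The "unknown" branch is exactly the failure mode described above: if Feast ever drops the `max_length=512` literal, the patch does nothing and no error is raised.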
Suggested Direction
This supports the ADR-008 direction (pymilvus over Feast) that's being discussed. The MCP server already demonstrates the cleaner pattern:
- Ingestion: Use `pymilvus` directly with `MilvusClient.upsert()` keyed on `file_unique_id` + `chunk_index` for idempotent writes (no drop-and-recreate).
- Serving: Already done: `kagent-feast-mcp/mcp-server/server.py` uses `MilvusClient` directly.
- Mark Feast pipelines as legacy: Keep them for reference but document that direct `pymilvus` is the production path.
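To make the ingestion point concrete, here is a minimal sketch of an idempotent upsert path via `MilvusClient`. The field names (`pk`, `content`, `embedding`) and the `file_unique_id:chunk_index` key scheme are illustrative assumptions, not the repo's actual schema:

```python
# Assumes: pip install pymilvus  (MilvusClient API, pymilvus >= 2.4)

def build_rows(file_unique_id, chunks, vectors):
    """Build upsert rows with a deterministic primary key per chunk."""
    return [
        {
            "pk": f"{file_unique_id}:{i}",  # stable key -> re-runs overwrite, not duplicate
            "file_unique_id": file_unique_id,
            "chunk_index": i,
            "content": chunk,
            "embedding": vec,
        }
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ]

def ingest(client, collection, file_unique_id, chunks, vectors):
    """client: a pymilvus.MilvusClient connected to the Milvus instance.

    upsert() replaces rows with matching primary keys and inserts the rest,
    so a mid-run failure leaves previously ingested data intact -- unlike
    the current drop-and-recreate pattern.
    """
    client.upsert(collection_name=collection,
                  data=build_rows(file_unique_id, chunks, vectors))
```

With stable keys, a retried pipeline run converges to the same collection state instead of starting from an empty collection.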
A PR freeze is in effect, so I'm just flagging this for architecture discussion. Happy to contribute to the ADR-008 doc.
Related: #181 (content truncation), #63 (model reload), #28 (connection pooling), #72 (codebase cleanup)