perf: Move SentenceTransformer to module level to prevent reload on every query by 24f3005089 · Pull Request #180 · kubeflow/docs-agent

24f3005089 · 2026-03-27T07:39:33Z

Fixes #178

Summary

This PR fixes a performance issue in which the SentenceTransformer model was reloaded from disk on every search request, adding 500 ms to 2 s of latency per call.

Changes

Moved the encoder = SentenceTransformer(EMBEDDING_MODEL) initialization from inside the milvus_search() function to module level
Applied fix to both server/app.py and server-https/app.py
Model is now loaded once at startup and reused for all queries
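
The change can be sketched as follows. This is a minimal illustration rather than the PR's actual diff: StubEncoder is a hypothetical stand-in for SentenceTransformer so the example runs without downloading a model, and the value of EMBEDDING_MODEL and the bodies of both functions are assumptions.

```python
from __future__ import annotations


# Hypothetical stand-in for sentence_transformers.SentenceTransformer;
# it only counts how many times a "model" gets constructed.
class StubEncoder:
    load_count = 0

    def __init__(self, model_name: str) -> None:
        StubEncoder.load_count += 1  # the real class would load the model here
        self.model_name = model_name

    def encode(self, text: str) -> list[float]:
        # Placeholder embedding, not a real model output.
        return [float(len(text))]


EMBEDDING_MODEL = "example-embedding-model"  # assumed placeholder value


def milvus_search_before(query: str) -> list[float]:
    # Before: a new encoder is built on every call.
    encoder = StubEncoder(EMBEDDING_MODEL)
    return encoder.encode(query)


# After: the encoder is built once, when the module is imported.
encoder = StubEncoder(EMBEDDING_MODEL)


def milvus_search_after(query: str) -> list[float]:
    # Reuses the module-level encoder; no per-call construction.
    return encoder.encode(query)
```

In this sketch, each call to milvus_search_before constructs a fresh encoder, while milvus_search_after only ever uses the one created at import time.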

Impact

Eliminates 500ms-2s model loading overhead on every query
Particularly beneficial in agentic RAG workflows where search_kubeflow_docs may be invoked multiple times per conversation turn
Improves overall responsiveness and reduces resource usage by avoiding repeated model loads from disk
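
As a rough illustration of the first two points, the per-turn saving scales with the number of search calls. The sketch below uses the 500 ms to 2 s range quoted above; the function name and the figure of three calls per turn are assumptions made for the example, not measurements from this PR.

```python
from __future__ import annotations

LOAD_SECONDS_MIN = 0.5  # lower bound quoted in this PR
LOAD_SECONDS_MAX = 2.0  # upper bound quoted in this PR


def reload_overhead_per_turn(search_calls: int) -> tuple[float, float]:
    """Return the (min, max) seconds spent reloading the model in one turn."""
    return (search_calls * LOAD_SECONDS_MIN, search_calls * LOAD_SECONDS_MAX)


# Assumed example: three search calls in one conversation turn.
low, high = reload_overhead_per_turn(3)
print(f"{low:.1f}s to {high:.1f}s of reload overhead avoided")
```

Under those assumptions, a turn with three search calls avoids between 1.5 s and 6 s of model loading.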

Testing

Code change only; syntax and logic were verified. Model initialization now happens at module import time, and the encoder variable is reused across all function calls.

google-oss-prow · 2026-03-27T07:39:38Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign franciscojavierarceo for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

24f3005089 added 2 commits March 27, 2026 12:20

docs(kagent): Add WSL2 port-forward fix to README

6951190

perf: Move SentenceTransformer to module level to prevent reload on every query

Signed-off-by: 24f3005089 <24f3005089@ds.study.iitm.ac.in>

e8b3487

google-oss-prow bot requested a review from franciscojavierarceo March 27, 2026 07:39

google-oss-prow bot added the size/S label Mar 27, 2026

24f3005089 wants to merge 2 commits into kubeflow:main from 24f3005089:fix/sentence-transformer-caching
