Problem statement
AILSS currently gets most of its retrieval power from OpenAI embeddings, but the Python-first
agent backend still lacks a bounded local reasoning stage between retrieval and answer
construction.
Today, the apps/api workflow is:
retrieve -> decide -> read -> answer -> validate
retrieval can already return grounded candidates from the local SQLite + sqlite-vec index
decide_context still relies on simple heuristics
validate_answer only checks that citations exist, not that the selected evidence was the
best available match
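The five-stage workflow above can be sketched as a simple orchestration loop. All stage bodies below are illustrative stubs standing in for the real apps/api code, not its actual implementation:

```python
# Minimal sketch of the retrieve -> decide -> read -> answer -> validate
# pipeline. Every stage body here is a hypothetical stub, not apps/api code.

def retrieve(query: str) -> list[dict]:
    # Would embed the query via OpenAI and search the SQLite + sqlite-vec index.
    return [{"path": "notes/a.md", "score": 0.9}, {"path": "notes/b.md", "score": 0.7}]

def decide_context(candidates: list[dict]) -> list[dict]:
    # Today: simple heuristics, e.g. keep the top-scoring candidate(s).
    return candidates[:1]

def read(selected: list[dict]) -> list[str]:
    # Would load full note bodies from the vault.
    return [f"contents of {c['path']}" for c in selected]

def answer(query: str, notes: list[str]) -> tuple[str, list[str]]:
    # Would build a grounded answer with citations to the read notes.
    return ("grounded answer", ["notes/a.md"])

def validate_answer(ans: str, citations: list[str]) -> bool:
    # Today: only checks that citations exist at all.
    return bool(citations)

def run_agent(query: str) -> tuple[str, bool]:
    candidates = retrieve(query)
    selected = decide_context(candidates)
    notes = read(selected)
    ans, citations = answer(query, notes)
    return ans, validate_answer(ans, citations)
```

The gap described next sits between `retrieve` and `decide_context`: nothing reorders candidates by actual relevance before selection.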
This leaves a gap between "good enough vector retrieval" and "high-confidence grounded answer
selection". We want a local, low-cost improvement that strengthens orchestration and quality
without replacing the existing OpenAI embedding path.
Proposed solution
Add an optional local reranker stage on top of the existing OpenAI embedding retrieval path.
Scope the first implementation to the Python backend and its Obsidian-managed runtime:
Keep OpenAI embeddings as the source of truth for indexing and query embeddings
Add a small local reranker model that scores the top semantic retrieval candidates
Reorder candidates before decide_context selects final notes for reading / answer building
Record reranker usage and latency in eval and run artifacts so the effect is measurable
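Recording reranker usage and latency could look like the following wrapper; the `rerank` callable and the metrics dict shape are assumptions for illustration, mirroring the field names proposed later in this issue:

```python
# Sketch: time a (hypothetical) rerank callable and record its metadata into a
# metrics dict, so eval and run artifacts can show the reranker's effect.
import time


def rerank_with_metrics(query: str, candidates: list, rerank, metrics: dict) -> list:
    start = time.perf_counter()
    reordered = rerank(query, candidates)
    # Field names mirror the proposed metrics metadata below.
    metrics["reranker_latency_ms"] = (time.perf_counter() - start) * 1000.0
    metrics["reranker_candidates"] = len(candidates)
    return reordered
```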
Design direction:
Python-first boundary:
implement reranking in apps/api first
do not require index DB schema changes for v1
keep the MCP / Node retrieval path unchanged unless the Python-first experiment proves out
Bounded local-model role:
use the local model for pairwise or query-document relevance scoring only
do not introduce a local generative answer model in this issue
fail fast if reranking is explicitly enabled but the local model cannot load
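The fail-fast rule above could be enforced at startup roughly like this; the loader callable and error type are illustrative, not existing apps/api names:

```python
# Fail-fast sketch: if reranking is explicitly enabled but the local model
# cannot load, raise instead of silently falling back to baseline retrieval.
# RerankerUnavailableError and the loader callable are hypothetical names.

class RerankerUnavailableError(RuntimeError):
    """Raised when reranking is enabled but the local model cannot load."""


def load_reranker(enabled: bool, model_name: str, loader):
    if not enabled:
        return None  # reranking is optional; baseline retrieval still works
    try:
        return loader(model_name)
    except Exception as exc:
        raise RerankerUnavailableError(
            f"reranker enabled but model {model_name!r} failed to load"
        ) from exc
```

Returning `None` when disabled keeps the baseline path untouched, while an explicit opt-in that cannot be honored becomes a hard error rather than a silent quality regression.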
Proposed file-level plan:
apps/api/src/ailss_api/config.py
add settings for reranker enablement and runtime contract
example env vars:
AILSS_API_RERANKER_ENABLED
AILSS_API_RERANKER_MODEL
AILSS_API_RERANK_CANDIDATES
AILSS_API_RERANK_KEEP_TOP_K
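Reading those env vars could look like the sketch below. The defaults (disabled, 20 candidates, keep top 5) are illustrative assumptions, not decided values, and the settings class name is hypothetical:

```python
# Sketch of the proposed reranker settings read from environment variables.
# Default values here are assumptions for illustration only.
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RerankerSettings:
    enabled: bool
    model: str
    candidates: int   # how many retrieval hits to feed the reranker
    keep_top_k: int   # how many reranked hits to pass to decide_context


def load_reranker_settings(env=os.environ) -> RerankerSettings:
    return RerankerSettings(
        enabled=env.get("AILSS_API_RERANKER_ENABLED", "0").lower() in ("1", "true"),
        model=env.get("AILSS_API_RERANKER_MODEL", "local-reranker"),
        candidates=int(env.get("AILSS_API_RERANK_CANDIDATES", "20")),
        keep_top_k=int(env.get("AILSS_API_RERANK_KEEP_TOP_K", "5")),
    )
```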
apps/api/src/ailss_api/models.py
extend retrieval / agent metrics with reranker metadata
example fields:
reranker_model
reranker_latency_ms
reranker_candidates
new module: apps/api/src/ailss_api/reranking.py
define a small interface for reranking
suggested contract:
class RerankCandidate(BaseModel): path, title, snippet, evidence_text
class RerankResult(BaseModel): path, score, rank
def rerank_candidates(query: str, candidates: list[RerankCandidate], settings: Settings) -> list[RerankResult]
apps/api/src/ailss_api/retrieval.py or apps/api/src/ailss_api/agent.py
keep retrieve_notes() as-is for baseline retrieval
call the reranker inside _retrieve_context() or right after retrieval results are returned
apps/api/src/ailss_api/evals.py
record reranker metadata in eval artifacts
packages/obsidian-plugin/src/pythonApi/pythonApiServiceController.ts
docs/architecture/python-first-local-agent-backend.md
docs/03-plan.md
Suggested runtime behavior:
rerank the top N semantic retrieval candidates, then keep the top K for context selection
Acceptance direction:
Constraints / context (optional)
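The suggested reranking contract from the file-level plan above can be sketched as follows. The scoring logic is a deliberate placeholder (query-term overlap) standing in for a local relevance model, and Settings is reduced to the one field used here:

```python
# Sketch of the proposed reranking contract. The overlap-based score is a
# placeholder for a local query-document relevance model; Settings is a
# minimal stand-in for the real apps/api settings object.
from pydantic import BaseModel


class Settings(BaseModel):
    rerank_keep_top_k: int = 5


class RerankCandidate(BaseModel):
    path: str
    title: str
    snippet: str
    evidence_text: str


class RerankResult(BaseModel):
    path: str
    score: float
    rank: int


def rerank_candidates(
    query: str, candidates: list[RerankCandidate], settings: Settings
) -> list[RerankResult]:
    terms = set(query.lower().split())

    def score(c: RerankCandidate) -> float:
        # Placeholder relevance: fraction of query terms found in the candidate.
        words = set(f"{c.title} {c.snippet} {c.evidence_text}".lower().split())
        return len(terms & words) / max(len(terms), 1)

    ordered = sorted(candidates, key=score, reverse=True)
    return [
        RerankResult(path=c.path, score=score(c), rank=i)
        for i, c in enumerate(ordered[: settings.rerank_keep_top_k])
    ]
```

Keeping the contract to plain query, candidates, and settings inputs means the placeholder scorer can later be swapped for a real local model without touching the call sites in retrieval or agent code.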