Description
This issue proposes adding RAG (Retrieval-Augmented Generation) to the Polyglut multi-model LLM chat so that responses can be grounded in a local knowledge base, in line with Polyglut's goal of being a model experimentation tool with an almost human-like memory.
Motivation
- Improve response quality by grounding answers in up-to-date or domain-specific knowledge.
- Enable users to upload or reference their own documents for context-aware chat.
- Reduce hallucinations and increase trust in the model’s outputs.
Proposed Solution
- Indexing: Create a vector-based knowledge base from the project's documentation, code, and other relevant files.
- Retrieving: When a user asks a question, the system searches this knowledge base for the most relevant information.
- Augmenting: The retrieved information is injected into the LLM's prompt, providing the model with real-time, specific context before it generates a response.
This will enable the LLM to answer questions about the Polyglut codebase and architecture accurately and with references.
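As a rough illustration of how these three steps could compose in a single chat turn, here is a minimal TypeScript sketch. `answerWithRag`, `Retriever`, and `Generator` are placeholder names rather than existing Polyglut APIs; the concrete pieces they stand for are defined in the tasks below.

```typescript
// Sketch of one RAG-augmented chat turn; the concrete pieces are built in the tasks below.
type Retriever = (query: string, topN: number) => Promise<string[]>;
type Generator = (prompt: string) => Promise<string>;

export async function answerWithRag(
  query: string,
  retrieveContext: Retriever,
  generate: Generator
): Promise<string> {
  // 1. Retrieve: look up the chunks most relevant to the user's question.
  const chunks = await retrieveContext(query, 5);

  // 2. Augment: prepend the retrieved chunks to the prompt as context.
  const prompt =
    "Use the following project context to answer.\n\n" +
    chunks.join("\n---\n") +
    "\n\nQuestion: " +
    query;

  // 3. Generate: let the LLM answer with that context in hand.
  return generate(prompt);
}
```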
Tasks
Task 1: Select and Set Up a Vector Store.
- Research and choose a suitable vector database (e.g., ChromaDB, FAISS, Pinecone).
- Implement the setup and configuration for the chosen vector store.
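For example, if ChromaDB were the chosen store, initial setup could be as small as the sketch below. The `chromadb` JS client, a locally running Chroma server, and the collection name `polyglut-docs` are assumptions for illustration, not decisions made by this issue.

```typescript
import { ChromaClient } from "chromadb";

// Connect to a locally running Chroma server (default address) and
// create or reuse a collection that will hold the embedded project chunks.
const client = new ChromaClient();

export async function getKnowledgeBase() {
  return client.getOrCreateCollection({ name: "polyglut-docs" });
}
```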
Task 2: Data Ingestion Pipeline.
- Write a script to recursively read files from the src/ directory.
- Chunk the text content into smaller, manageable pieces.
- Generate embeddings for each text chunk using a suitable embedding model.
- Store the text chunks and their corresponding embeddings in the vector store.
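A minimal sketch of such a pipeline, assuming the Chroma collection from Task 1 and OpenAI's `text-embedding-3-small` embedding model (both placeholders until the Task 1 decision is made). The `getKnowledgeBase` helper, the module path, and the 1000-character chunk size are likewise illustrative.

```typescript
import { promises as fs } from "fs";
import path from "path";
import OpenAI from "openai";
import { getKnowledgeBase } from "./knowledgeBase"; // hypothetical module from Task 1

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Recursively collect files under a directory.
async function collectFiles(dir: string): Promise<string[]> {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  const nested = await Promise.all(
    entries.map((entry) => {
      const full = path.join(dir, entry.name);
      return entry.isDirectory() ? collectFiles(full) : Promise.resolve([full]);
    })
  );
  return nested.flat();
}

// Naive fixed-size chunking; a smarter splitter (by heading, function, etc.) can replace this.
function chunk(text: string, size = 1000): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) chunks.push(text.slice(i, i + size));
  return chunks;
}

export async function ingest(rootDir = "src/") {
  const collection = await getKnowledgeBase();
  for (const file of await collectFiles(rootDir)) {
    const pieces = chunk(await fs.readFile(file, "utf8"));
    if (pieces.length === 0) continue; // skip empty files

    // Embed all chunks of the file in one API call, then store them with their source path.
    const { data } = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: pieces,
    });
    await collection.add({
      ids: pieces.map((_, i) => `${file}#${i}`),
      documents: pieces,
      embeddings: data.map((d) => d.embedding),
      metadatas: pieces.map(() => ({ source: file })),
    });
  }
}
```

Keeping the source file path in each chunk's metadata is what later makes source citation (Task 5) possible.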
Task 3: Implement the Retrieval Logic.
- Create a function that takes a user's query and finds the most similar vectors in the store.
- Return the original text chunks associated with the top N most similar vectors.
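A possible shape for that function, again assuming the Task 1 collection and the same embedding model used at ingestion time; `retrieveContext` and `RetrievedChunk` are hypothetical names.

```typescript
import OpenAI from "openai";
import { getKnowledgeBase } from "./knowledgeBase"; // hypothetical module from Task 1

const openai = new OpenAI();

export interface RetrievedChunk {
  text: string;
  source: string;
}

// Embed the user's query and return the text (and source file) of the
// top-N nearest chunks in the vector store.
export async function retrieveContext(query: string, topN = 5): Promise<RetrievedChunk[]> {
  const collection = await getKnowledgeBase();
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  const results = await collection.query({
    queryEmbeddings: [data[0].embedding],
    nResults: topN,
  });

  return (results.documents[0] ?? []).map((text, i) => ({
    text: text ?? "",
    source: String(results.metadatas[0]?.[i]?.source ?? "unknown"),
  }));
}
```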
Task 4: Integrate RAG into the Chat API.
- Modify the main chat function to call the retrieval logic before sending the prompt to the LLM.
- Construct the final LLM prompt by combining the original query with the retrieved context.
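One way this integration could look, assuming the `retrieveContext` helper sketched under Task 3 and an OpenAI chat model; the model name, system-prompt wording, and `chatWithRag` function are placeholders, not the existing chat code.

```typescript
import OpenAI from "openai";
import { retrieveContext } from "./retrieve"; // hypothetical module from Task 3

const openai = new OpenAI();

// Wraps the existing chat call: retrieve context first, then send an augmented prompt.
export async function chatWithRag(userQuery: string) {
  const chunks = await retrieveContext(userQuery, 5);

  // Number each chunk and note its source file so the model can cite it as [n].
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source})\n${c.text}`)
    .join("\n\n");

  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          "Answer using the numbered project context below and cite sources as [n].\n\n" +
          context,
      },
      { role: "user", content: userQuery },
    ],
  });

  return {
    answer: completion.choices[0].message.content ?? "",
    // Returning the sources alongside the answer lets the UI render citations (Task 5).
    sources: chunks.map((c) => c.source),
  };
}
```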
Task 5: Display Source Information in the UI.
- Update the front end to display the source citations or links alongside the LLM's response.
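If the front end is React-based, the citation list could be rendered with something like the component below; the `ChatAnswer` component and the shape of the `sources` payload are purely illustrative and depend on how Task 4 returns data.

```tsx
// Illustrative component: shows the model's answer followed by its source citations.
interface ChatAnswerProps {
  answer: string;
  sources: string[]; // e.g. file paths returned by the chat API
}

export function ChatAnswer({ answer, sources }: ChatAnswerProps) {
  return (
    <div className="chat-answer">
      <p>{answer}</p>
      {sources.length > 0 && (
        <ul className="chat-sources">
          {sources.map((source, i) => (
            <li key={i}>{source}</li>
          ))}
        </ul>
      )}
    </div>
  );
}
```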
Acceptance Criteria
- Data Ingestion: A script exists to parse and embed key project documents (e.g., README.md, core source files, API docs) into a vector store.
- Contextual Retrieval: The system can programmatically retrieve the top N most relevant chunks of text from the vector store based on a user's query.
- Prompt Augmentation: The retrieved text is successfully included in the LLM prompt as contextual information.
- Correct Responses: The LLM's responses to project-specific questions (e.g., "What is the purpose of the Polyglut.js file?", "How does the data-ingestion module work?") are accurate and reference the ingested data.
- Source Citation: The generated response includes a citation or link to the original source file or document from which the information was retrieved.
Labels
enhancement, AI/ML