An LLM semantic caching system aiming to enhance user experience by reducing response time via cached query-result pairs.
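The core idea behind the projects listed here can be sketched in a few lines: embed the incoming query, compare it against embeddings of previously answered queries, and return the cached answer on a close-enough match instead of calling the LLM. The sketch below is illustrative only; the `embed` function is a toy character-frequency stand-in for a real embedding model (the listed projects use models served via Redis, Qdrant, or provider APIs), and the threshold value is an assumption.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: a normalized
    # bag-of-characters vector, enough to illustrate the flow.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold   # similarity needed for a cache hit
        self.entries = []            # list of (embedding, response)

    def get(self, query):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]           # cache hit: skip the LLM call
        return None                  # cache miss: caller invokes the LLM

    def put(self, query, response):
        self.entries.append((embed(query), response))

cache = SemanticCache(threshold=0.9)
cache.put("What is the capital of France?", "Paris")
print(cache.get("what is the capital of france"))  # prints "Paris"
```

Production versions replace the linear scan with an approximate-nearest-neighbor index (Redis vector search, Qdrant, etc.) so lookups stay fast at scale.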
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
Reliable and Efficient Semantic Prompt Caching with vCache
Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.
A RAG-based chatbot with an integrated semantic cache and guardrails.
This repository contains sample code demonstrating how to implement a verified semantic cache using Amazon Bedrock Knowledge Bases to prevent hallucinations in Large Language Model (LLM) responses while improving latency and reducing costs.
High-performance LLM query cache with semantic search. Cuts API costs by up to 80% and latency from 8.5 s to 1 ms using Redis and the Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.
Redis Vector Similarity Search, Semantic Caching, Recommendation Systems and RAG
A ChatBot using Redis Vector Similarity Search, which can recommend blogs based on user prompt
Optimized RAG Retrieval with Indexing, Quantization, Hybrid Search and Caching
Ultra-fast Semantic Cache Proxy written in pure C
🚀 Optimize LLM usage with PromptCache, a smart middleware that cuts costs and speeds up responses by caching repetitive queries.
Redis offers a unique capability to keep your data fresh while serving it through an LLM chatbot.
Semantic cache for your LLM apps in Go!
Episodic memory and semantic cache proxy for LLM APIs with ~40% token savings
Zero-dependency, type-safe Node.js client for Vecs Semantic Cache.
🔍 Optimize RAG systems by exploring Lexical, Semantic, and Hybrid Search methods for better context retrieval and improved LLM responses.
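Hybrid search, as explored in the project above, blends an exact-word (lexical) signal with an embedding (semantic) signal so that neither keyword mismatches nor vocabulary drift alone sinks retrieval. A minimal sketch, assuming a toy character-frequency embedding in place of a real model and a simple Jaccard word overlap in place of BM25; the `alpha` blend weight is an assumption:

```python
import math

def char_embed(text):
    # Toy character-frequency embedding standing in for a real model.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    n = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / n for v in vec]

def semantic_score(query, doc):
    # Cosine similarity of unit-length toy embeddings.
    return sum(x * y for x, y in zip(char_embed(query), char_embed(doc)))

def lexical_score(query, doc):
    # Jaccard overlap of word sets: a crude stand-in for BM25.
    qs, ds = set(query.lower().split()), set(doc.lower().split())
    return len(qs & ds) / len(qs | ds) if qs | ds else 0.0

def hybrid_rank(query, docs, alpha=0.5):
    # alpha weights the semantic signal against the lexical one.
    scored = [(alpha * semantic_score(query, d)
               + (1 - alpha) * lexical_score(query, d), d) for d in docs]
    return [d for _, d in sorted(scored, reverse=True)]

docs = ["Paris is the capital of France",
        "A recipe for chocolate cake"]
print(hybrid_rank("capital of France", docs)[0])
```

Real systems typically fuse BM25 and dense-vector rankings (e.g. via reciprocal rank fusion) rather than a fixed linear blend, but the trade-off being tuned is the same.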