A high-performance memory layer for Large Language Models (LLMs) with semantic search, graph-based relationships, and multi-agent coordination. Give your LLM applications long-term memory, context retention, and knowledge persistence.
- Automatic Conversation Management: Drop-in conversation loop that retrieves relevant memories, calls your LLM, extracts facts/preferences/entities from each turn, and learns from outcomes -- zero manual memory plumbing
- LLM-Optimized Memory: Store and retrieve conversation context, facts, and knowledge for LLMs
- Semantic Search: Fast vector similarity search with automatic deduplication - perfect for RAG systems
- Metadata Filtering: Organize and filter memories by custom metadata (tags, categories, session IDs, etc.)
- Multi-layered Memory: Episodic (conversations), semantic (facts), procedural (workflows), and agent state
- Case-Based Reasoning: Store conversation outcomes with reward signals and retrieve past successes/failures to guide future responses
- Graph Memory Layer: Multi-hop traversal for complex reasoning and relationship queries
- FFI Support: Python and JavaScript bindings - integrate with any LLM framework
- Production Ready: Compression, audit trails, background jobs for enterprise LLM applications
from membrain import Conversation
from openai import OpenAI
openai_client = OpenAI()
def llm(messages):
response = openai_client.chat.completions.create(model="gpt-4o-mini", messages=messages)
return response.choices[0].message.content
with Conversation(llm_callable=llm) as conv:
# Membrain automatically retrieves relevant memories, calls your LLM,
# and extracts facts, preferences, entities from each turn
response = conv.reply("I prefer dark mode and work at Acme Corp as a backend engineer.")
response = conv.reply("Can you recommend a code editor setup for me?")
# Memories persist -- ask across turns or even across sessions
response = conv.reply("What do you remember about me?")
# Store the outcome so future conversations learn from this one
conv.end(outcome="User was satisfied with recommendations", reward=1.0)import { Conversation } from "membrain";
import OpenAI from "openai";
const openai = new OpenAI();
async function llm(messages) {
const response = await openai.chat.completions.create({ model: "gpt-4o-mini", messages });
return response.choices[0].message.content;
}
const conv = new Conversation(llm);
try {
const r1 = await conv.reply("I prefer dark mode and work at Acme Corp as a backend engineer.");
const r2 = await conv.reply("Can you recommend a code editor setup for me?");
const r3 = await conv.reply("What do you remember about me?");
conv.end("User was satisfied with recommendations", 1.0);
} finally {
conv.close();
}For fine-grained control, use MembrainClient directly:
from membrain import MembrainClient
memory = MembrainClient()
memory.store_fact("Python PEP 8 is the style guide", confidence=0.95)
memory.store_preference(holder="user", subject="theme", preference="dark mode", strength="strong")
memory.store_entity(name="Acme Corp", entity_type="organization")
results = memory.search("coding standards", limit=5)
for m in results.memories:
print(f"[{m.memory_type}] {m.content}")
memory.close()Each call to conv.reply() runs a full memory-augmented loop:
- Retrieve -- Semantic search finds relevant memories and past conversation cases
- Augment -- Memories and case-based reasoning context are injected into the system prompt
- Generate -- Your LLM is called with the enriched prompt and conversation history
- Extract -- The turn is analyzed to extract facts, preferences, observations, entities, and concepts
- Store -- Extracted memories are persisted for future retrieval across sessions
When you call conv.end(), the full conversation is stored as a case with a reward signal. Future conversations retrieve successful (and unsuccessful) cases to inform response strategy.
| Parameter | Default | Description |
|---|---|---|
llm_callable |
required | Function that takes messages and returns a response string |
system_prompt |
built-in | Custom system prompt |
memory_limit |
10 | Max memories injected per turn |
auto_extract |
true | Automatically extract and store memories |
history_limit |
50 | Max turns kept in the context window |
pip install membrainSet MEMBRAIN_LIB_PATH to point to libmembrain_ffi.so if not auto-detected.
npm install membrainSet MEMBRAIN_LIB_PATH to point to the shared library if needed.
Complete Documentation - Full documentation index
Quick Links:
- Quickstart Guide - Get started in 5 minutes
- Python API Reference - Complete Python API
- JavaScript API Reference - Complete JavaScript/TypeScript API
- Architecture Guide - System design and internals
- Cookbooks - Practical examples
- Basic Usage - Fundamental operations
- Graph Memory - Multi-hop traversal
- Multi-Agent Systems - Collaborative agents
- Advanced Patterns - Optimization techniques
# Build Rust core
cargo build --release
# The FFI library will be at target/release/libmembrain_ffi.soSee CONTRIBUTING.md for development setup and guidelines.