Skip to content

Javierg720/agent-vector-memory

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

vector-memory

Local-first semantic memory + self-learning for AI coding agents. ~300 lines of Python, one dependency (numpy), zero servers, every artifact a plain file you can read, grep, and git diff.

It gives a file-based agent memory (like Claude Code's) two capabilities that are usually locked behind heavyweight multi-agent frameworks:

Capability What it does Heavyweight-framework equivalent
Vector recall Semantic search over your markdown memory files A binary HNSW vector database
Self-learning A trajectory log of what worked, retrieved alongside facts so past solutions inform new work A "ReasoningBank" / trajectory store
Auto-recall A UserPromptSubmit hook injects relevant memory into every prompt automatically An always-on retrieval agent

Demo

Run it yourself — self-contained, touches only a temp dir:

./examples/demo.sh

Real output (M = stored fact, T = learned trajectory). Watch step 4: the lesson recorded in step 3 outranks the static note on a related query — that's the self-learning loop working.

### 2. Semantic recall — ask in your own words
$ memdb.py search "how do I publish a website and clear the cache" -k 2
1. [M 0.282] CDN deploy notes
     Deploy static sites by uploading build output then purging the CDN cache

### 3. Self-learning — record what worked
$ memdb.py learn --task "Fix 502 from a CDN" --did "..." --avoid "Token was revoked; use dashboard"
Learned trajectory -> trajectories/fix-502-from-a-cdn.md

### 4. The lesson now surfaces on a related query
$ memdb.py search "getting a 502 error behind the cdn" -k 2
1. [T 0.750] fix-502-from-a-cdn          <- the lesson, ranked first
2. [M 0.256] CDN deploy notes

### 5. The auto-recall hook (fed a prompt the way Claude Code feeds it)
$ echo '{"prompt":"deploy the site and clear the cache"}' | recall_hook.py
<vector-memory-recall>
Possibly-relevant saved memory/trajectories for this request:
- [fact 0.54] CDN deploy notes — memory/cdn-deploy.md
</vector-memory-recall>

To turn this into a GIF for the repo header: asciinema rec demo.cast -c './examples/demo.sh' && agg demo.cast demo.gif

Why

Most "agent memory" frameworks ship a vector DB daemon, an extra MCP server, swarm consensus, and a benchmark table — a lot of trust surface for two good ideas. This project distills those two ideas (semantic recall + trajectory learning) into something you fully own: no daemon, no API key required, nothing hidden in a binary store.

The default backend is TF-IDF cosine similarity in pure numpy — instant, offline, free, and deterministic. That's enough for hundreds of memory files. When you want true paraphrase/synonym matching at larger scale, set one env var to switch to real embeddings via any OpenAI-compatible endpoint (OpenAI, DashScope, Together, local Ollama, …).

Install

git clone https://github.com/Javierg720/agent-vector-memory.git
cd agent-vector-memory
python3 -m pip install numpy      # the only dependency
./install.sh                      # optional: registers the auto-recall hook

Point it at your memory folder (a directory of markdown files with name:/description: frontmatter — Claude Code's ~/.claude/memory works out of the box):

export VECTOR_MEMORY_DIR="$HOME/.claude/memory"

Use

# semantic recall over facts + learned lessons
python3 scripts/memdb.py search "deploy a static site and clear the CDN cache" -k 5

# record what worked (the self-learning step)
python3 scripts/memdb.py learn \
  --task "Deploy static site behind a CDN" \
  --did "Upload build/ over SFTP; purge cache via the dashboard" \
  --avoid "Don't rely on the API token — it was revoked; use the dashboard" \
  --tags "deploy,cdn" --date "2026-06-14"

python3 scripts/memdb.py list      # list trajectories
python3 scripts/memdb.py index     # rebuild the index

search and learn rebuild the index automatically; you rarely call index by hand.

The loop

  1. Recall before a non-trivial task — pull relevant facts and past lessons.
  2. Do the work.
  3. Learn after a non-obvious solve — record a trajectory so next time is faster.

Wire up install.sh and step 1 happens on every prompt automatically.

Configuration

All optional, all environment variables:

Var Default Purpose
VECTOR_MEMORY_DIR ~/.claude/memory Folder of markdown "fact" files to index
VECTOR_MEMORY_HOME repo dir Where trajectories/ and index/ live
VECTOR_MEMORY_THRESHOLD 0.10 Min score for the auto-recall hook to surface a hit
VECTOR_MEMORY_MAX_HITS 4 Max hits the hook injects
EMBED_API_KEY unset Set to switch from local TF-IDF to embeddings
EMBED_BASE_URL OpenAI Any OpenAI-compatible /embeddings endpoint
EMBED_MODEL text-embedding-3-small Embedding model name

How it works

  • Documents are markdown files with name: and description:/task: frontmatter. Facts live in VECTOR_MEMORY_DIR; trajectories live in trajectories/. Both are indexed and searched together.
  • Local backend builds a TF-IDF matrix (title/description weighted 3×), L2-normalizes rows, and ranks by cosine similarity to the query vector. Pure numpy.
  • Embedding backend calls an OpenAI-compatible endpoint, caches vectors in index/vectors.json, and ranks the same way.
  • The hook (scripts/recall_hook.py) reads Claude Code's UserPromptSubmit JSON from stdin, searches against the prompt, and prints a short recall block to stdout — which Claude Code injects as context. It exits 0 and stays silent on any error, so it can never break prompt submission.

License

MIT — see LICENSE.

About

Local-first vector memory + self-learning for AI coding agents. Semantic recall + a 'what worked' trajectory store + a Claude Code auto-recall hook, in ~300 lines of numpy. No server, no vector DB daemon.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors