memo — reference manual

The README is the front door. This is the full manual: install detail, per-client MCP setup, the complete tool and CLI surface, every MEMO_* knob, design notes, and how memo compares to other agent-memory projects.

Install detail
MCP setup
MCP tools
Ambient memory
Surfaces — session briefing, semantic map, time-machine
CLI reference
Configuration
Design and comparison

Install detail

Recommended install: keep memo isolated as its own tool. Do not vendor it inside another project's .venv; the MLX runtime, model cache, MCP server, sqlite state, and CLI should move together as one subsystem.

# One-line installer (pipx under the hood, installs GitHub master,
# and configures Claude Code + Codex + OpenCode + Devin Desktop when available)
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | bash
# or install the latest published PyPI release explicitly
pipx install mlx-memo
# or
uv tool install mlx-memo
# or via the Homebrew tap
brew tap jagoff/memo && brew install mlx-memo

Each of these exposes two binaries: memo (CLI) and memo-mcp (MCP server). For MCP clients, prefer an isolated tool install so memo's MLX dependencies, sqlite state, and memo-mcp runtime stay independent from whichever repo is active in your shell.

The PyPI distribution is mlx-memo as of 0.5.0. Earlier versions shipped as memo-mcp; the binary names haven't changed, so existing MCP configs keep working. The one-line installer installs GitHub master by default so it can deploy repo changes before the next PyPI release exists.

If you are developing this repo and want the real system install to use your checkout:

pipx install --force /path/to/memo
memo doctor --strict-runtime
memo --version

Installer knobs

# Install the latest published PyPI release instead of GitHub master.
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | MEMO_INSTALL_FROM_PYPI=1 bash

# Pin a published PyPI version.
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | MEMO_VERSION=0.6.0 bash

# Install from an explicit pipx spec (local checkout, git ref, wheel, etc.).
MEMO_INSTALL_SPEC=/path/to/memo ./install.sh

# Skip agent-client configuration during install.
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | MEMO_INSTALL_SKIP_AGENT_CONFIG=1 bash

# Force-skip the MLX model download (models load lazily on first use).
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | MEMO_INSTALL_DOWNLOAD_MODELS=no bash

# Force-yes the MLX model download (skip the interactive confirmation).
curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | MEMO_INSTALL_DOWNLOAD_MODELS=yes bash

The models are a core dependency: the embedder, reranker, and chat models are all required for retrieval and ambient recall. On an interactive terminal the installer asks for confirmation (default Y); on a piped install the default is also yes. Re-run the download manually at any time:

# Download all default-profile models (~7 GB, shows progress, safe to re-run)
MEMO_NONINTERACTIVE=1 memo prewarm --download-all

# Or download individual models with the HF CLI
hf download mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ
hf download mku64/Qwen3-Reranker-0.6B-mlx-8Bit
hf download mlx-community/Qwen2.5-3B-Instruct-4bit
hf download mlx-community/Qwen2.5-7B-Instruct-4bit

# Optional quality profile.
hf download mlx-community/Qwen3-Embedding-4B-4bit-DWQ
hf download mlx-community/Qwen3-4B-Instruct-2507-4bit-DWQ-2510

Stack

Component	Choice	Why
LLM (chat)	`Qwen2.5-7B-Instruct-4bit` + `3B helper` via `mlx-lm`	Two-tier; 7B for `ask()` synthesis, 3B for cheap helpers. Both 4-bit fit comfortably.
Embedder	`Qwen3-Embedding-0.6B-4bit-DWQ` by default; `Qwen3-Embedding-4B-4bit-DWQ` in `quality`	1024-dim default, 2560-dim quality. Choose via `MEMO_MODEL_PROFILE`.
Reranker	`mku64/Qwen3-Reranker-0.6B-mlx-8Bit`	Cross-encoder over top-30 from vec+BM25, then alpha-fusion.
Vector store	`sqlite-vec`	One file, no daemon, embedded. Reset = `rm memvec.db`.
Source of truth	Markdown files under `MEMO_DATA_DIR` with YAML frontmatter	Human-editable; sync via iCloud/git/Syncthing.
MCP transport	`fastmcp`	Stdio out of the box.

Installing on another Mac

For a fresh Apple Silicon Mac, run the one-line installer first, then bring over the corpus. (On Linux / Ubuntu, install the CPU backend with pipx install "mlx-memo[cpu]" instead — see ubuntu.md.)

curl -fsSL https://raw.githubusercontent.com/jagoff/memo/master/install.sh | bash
memo doctor --strict-runtime
memo install-slash --client claude-code --client codex --client opencode --client devin-desktop

To move existing data:

# On the old Mac: portable zip with .md memories + memvec.db + history.db.
memo backup --out ~/Desktop/memo-transfer.zip

# On the new Mac, after installing memo:
memo restore ~/Desktop/memo-transfer.zip --reindex --yes
memo doctor --strict-runtime

If your memories already live in an iCloud/Syncthing/Git-synced Obsidian folder, point the new Mac at that same folder instead of copying the zip:

memo init
memo reindex

MEMO_DATA_DIR holds the human-readable .md source of truth; MEMO_STATE_DIR (default ~/.local/share/memo) holds rebuildable indexes plus sidecars such as history.db — keep history.db if you want time-machine snapshots to survive the move. Full checklist: install-new-mac.md.

Verify no old install is being used

which -a memo
which -a memo-mcp
pipx list --short
memo doctor --strict-runtime

A healthy isolated install prints a single memo path, resolves memo and memo-mcp from the same environment, and passes memo doctor --strict-runtime.

MCP setup

After installing mlx-memo, register the MCP server with your client. The memo CLI prints commands pinned to the resolved memo-mcp executable so clients don't accidentally start a copy from a project .venv:

memo install-slash

install-slash configures Claude Code, Codex, Devin Desktop, and Devin where each supports it, and forwards current MEMO_* model/storage env vars into each MCP client config. This matters with the 2560-dim quality embedder: GUI clients often don't inherit your shell env, and a 1024/2560 mismatch breaks semantic search until the config is updated or memvec.db is rebuilt.

Released wheels include the Claude/Codex/Devin agent assets, so a normal pipx / uv tool / Homebrew install is enough. When developing from a local checkout, pass --repo /path/to/memo to test uncommitted plugin changes.

Tools surface inside the agent as mcp__memo__memo_*. Agent installs default to a five-tool surface (briefing, search, ask, get, save) so administrative schemas don't consume model context — set MEMO_MCP_PROFILE=core (25 tools) or full (everything) only for a dedicated administrative client.

Claude Code

memo mcp-command --client claude-code
# then run the printed command, e.g.
claude mcp add-json -s user memo '{"type":"stdio","command":"/Users/you/.local/pipx/venvs/mlx-memo/bin/memo-mcp","args":[],"env":{"MEMO_NONINTERACTIVE":"1"}}'

Or hand-edit ~/.claude.json:

{
  "mcpServers": {
    "memo": {
      "type": "stdio",
      "command": "/path/to/memo-mcp",
      "args": [],
      "env": {
        "MEMO_NONINTERACTIVE": "1",
        "MEMO_MCP_PROFILE": "agent"
      }
    }
  }
}

Restart Claude Code. If it starts the wrong server, run memo doctor --strict-runtime — it warns when memo/memo-mcp resolve from a project-local venv or from different environments.

Codex CLI

memo mcp-command --client codex
# then run the printed command, e.g.
codex mcp add memo --env MEMO_NONINTERACTIVE=1 --env MEMO_MCP_PROFILE=agent -- /Users/you/.local/pipx/venvs/mlx-memo/bin/memo-mcp
codex mcp list
memo install-slash --client codex   # also installs the memo skill

Current Codex CLI builds list only built-in slash commands in the TUI dispatcher. The installer writes the exact memo skill to $CODEX_HOME/skills/memo/SKILL.md; Codex loads it as a model-visible skill and routes to the memo MCP server, but /memo won't appear in that TUI menu until Codex exposes custom skills there.

Devin for Terminal

memo mcp-command --client devin
# then run the printed command, e.g.
devin mcp add -s user -e MEMO_NONINTERACTIVE=1 memo -- /Users/you/.local/pipx/venvs/mlx-memo/bin/memo-mcp
devin mcp list

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "memo": {
      "command": "/path/to/memo-mcp",
      "env": { "MEMO_NONINTERACTIVE": "1" }
    }
  }
}

Devin Desktop

Devin Desktop stores MCP servers in ~/.devin/mcp.json. memo can write that file directly (memo install-slash --client devin-desktop) or print the JSON block for manual editing (memo mcp-command --client devin-desktop). It preserves any existing mcpServers and only replaces the memo entry. Set DEVIN_DESKTOP_MCP_CONFIG for a non-standard config path.
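The preserve-and-replace behaviour described above can be sketched in a few lines. This is an illustration, not memo's actual writer: the function name is hypothetical, and the shape of the memo entry is assumed to mirror the Claude Desktop block shown earlier.

```python
import json
from pathlib import Path


def upsert_memo_server(config_path: Path, memo_mcp_path: str) -> dict:
    """Add or replace only the "memo" entry; every other server is left untouched."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")
    servers = config.setdefault("mcpServers", {})
    # Assumed entry shape, copied from the Claude Desktop example in this manual.
    servers["memo"] = {
        "command": memo_mcp_path,
        "args": [],
        "env": {"MEMO_NONINTERACTIVE": "1"},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Running it twice is idempotent: the second call overwrites the memo entry with the same content and leaves the rest of the file as it was.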

Cursor / Cline / Continue

Each has its own MCP config UI but the contract is the same: register a stdio server pointing at the memo-mcp binary. Print a portable mcpServers block with memo mcp-command --client json.

The `/memo` slash command

/memo ships only for CLIs that can expose an exact custom /memo; the backend is always the same isolated memo-mcp server.

# Claude Code — registers the /memo skill, MCP server, and ambient hooks together
memo install-slash --client claude-code
# or manually:
claude plugin marketplace add jagoff/memo
claude plugin install memo@memo -s user

# Codex — installs a user skill + the Codex plugin
memo install-slash --client codex

# Devin — installs the /memo router skill under ~/.config/devin/skills/
memo install-slash --client devin

Restart the client (or open a new session) after installing from the CLI so the slash-command registry reloads. The skill routes user input to the right MCP tool:

Input	Action
`/memo <query>`	semantic search (k=5, snippet body)
`/memo`	smart capture — distills the turn's insight and saves it
`/memo list [n]`	recent memories
`/memo save <text>`	save with auto-derived type/tags
`/memo get <id|prefix>`	full record (prefix ≥4 chars)
`/memo update <id|prefix> [flags] [body]`	patch metadata or body
`/memo delete <id|prefix>`	delete (asks confirmation)
`/memo ask <question>`	RAG synthesis with citations
`/memo stats`	totals + paths + models
`/memo reindex`	absorb edits made directly in Obsidian
`/memo history [op] [id]`	audit log of save/update/delete
`/memo consolidate [threshold]`	cluster near-duplicates + merge proposals
`/memo map [--output FILE]`	generate 2D semantic canvas HTML
`/memo doctor [--gc] [--fix]`	self-check + orphan detect

MCP tools

Tool	What it does
`memo_save(content, title?, type?, tags?)`	Persist a new memory; returns the full record.
`memo_search(query, limit?, type?, body_chars=280, mode="hybrid")`	Top-k. `hybrid` (default) fuses vec + bm25 via RRF, then optionally re-ranks. `vec` is semantic only; `bm25` is keyword (FTS5 unicode61, diacritic-stripping for Spanish).
`memo_list(limit?, type?)`	Recent by `updated` desc.
`memo_get(id)`	Full record. Accepts a unique prefix ≥4 chars (git-style); returns `{"error": "ambiguous", "matches": [...]}` on collision.
`memo_update(id, title?, type?, tags?, content?)`	Patches fields; re-embeds only if body changed.
`memo_reindex()`	Re-scan vault, re-embed entries whose `body_hash` diverged.
`memo_delete(id)`	Removes from vec + disk.
`memo_ask(question)`	RAG synthesis; cites memories by id.
`memo_chat_ask(question, history?, context?)`	Chat-shaped RAG envelope (`memo.chat_ask.v2`) with answer, citations, retrieval trace, and synthesis status.
`memo_stats()`	Counts, paths, active models.
`memo_history(limit?, record_id?)`	Recent save/update/delete events, optionally filtered to one record.
`memo_record_diff(id, limit?)`	Chronological audit trail for one record with field-level diffs.
`memo_consolidate()`, `memo_extract_entities()`, `memo_entities()`	Corpus maintenance — see CHANGELOG.
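The git-style prefix lookup that memo_get accepts can be sketched as follows. Only the "ambiguous" error shape comes from the table above; the function name and the other two error names are hypothetical.

```python
def resolve_id(prefix: str, known_ids: list[str]) -> dict:
    """Resolve a full id, or a unique prefix of at least 4 characters."""
    if prefix in known_ids:
        return {"id": prefix}
    if len(prefix) < 4:
        return {"error": "prefix_too_short"}  # hypothetical error name
    matches = sorted(i for i in known_ids if i.startswith(prefix))
    if not matches:
        return {"error": "not_found"}  # hypothetical error name
    if len(matches) > 1:
        # Same shape as the collision response documented for memo_get.
        return {"error": "ambiguous", "matches": matches}
    return {"id": matches[0]}
```

The minimum length keeps short, accidental prefixes from silently matching a single record in a small corpus.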

Ambient memory

Install the bundled Claude Code plugin and memo silently consults your past on every prompt and injects the most relevant memories as additionalContext — the agent sees them before answering, no manual invocation. With the plugin, six hooks plug in automatically:

Event	Command	Mode	Budget	Purpose
`SessionStart` (startup/clear)	`memo prewarm`	async	30 s	Pre-loads MLX embedder + reranker; writes warm-signal file
`SessionStart` (startup/clear)	`memo recall-daemon start`	async	5 s	Starts the recall daemon (keeps embedder in RAM; <200 ms recall)
`SessionStart` (startup/resume)	`memo briefing`	sync	5 s	Session-briefing panel: open loops, memory of the day, last session
`UserPromptSubmit`	`memo recall-hook`	sync	8 s	Queries the recall daemon (fast path) or falls back to BM25 when cold
`Stop`	`memo capture-stop`	async	30 s	Extracts insights from the finished exchange via helper LLM
`Stop`	`memo session checkpoint`	async	5 s	Snapshots session state for crash recovery

Note: idle capture and the other hooks require Claude Code's hook system (hooks/hooks.json). Agents that connect over MCP alone (OpenCode, Devin Desktop, …) get only the memo_* tools; to capture from them, add a similar idle trigger through the agent's native hook system or run memo session idle-maintenance --mode capture. All hooks run 100% locally; your prompts never leave the machine.

Recall daemon

The recall daemon is the hot-path optimization that makes ambient recall feel instant. Without it, each UserPromptSubmit spawns a fresh Python process that re-imports MLX from disk (~1–2 s even when cached). With it, a single long-lived process keeps the embedder in RAM and answers socket requests in <200 ms.

SessionStart
  └─ memo recall-daemon start (async)
       └─ loads Memory + embedder
       └─ listens on ~/.local/share/memo/recall.sock

UserPromptSubmit
  └─ memo recall-hook
       ├─ daemon running? → socket request → <200 ms → additionalContext
       └─ daemon not ready? → BM25 fallback → ~100 ms → additionalContext

memo recall-daemon start    # start in background (also auto-started by the hook)
memo recall-daemon stop     # send SIGTERM + cleanup
memo recall-daemon status   # pid, socket path, warm/cold state

Logs: ~/Library/Logs/memo/recall-daemon.log. The daemon restarts on the next session start if macOS killed it under memory pressure.
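The fast-path/fallback split in the diagram above amounts to "try the socket, otherwise degrade". A minimal sketch, assuming a hypothetical wire format (one JSON object per line in each direction) and a caller-supplied fallback; memo's real protocol is not documented here.

```python
import json
import socket


def recall(prompt: str, socket_path: str, fallback, timeout: float = 0.2) -> list:
    """Ask the daemon over a Unix socket; call `fallback(prompt)` if it is unreachable."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(socket_path)
            # Hypothetical request: a single JSON line carrying the prompt.
            sock.sendall(json.dumps({"prompt": prompt}).encode("utf-8") + b"\n")
            chunks = []
            while not chunks or not chunks[-1].endswith(b"\n"):
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        # Hypothetical reply: {"hits": [...]}.
        return json.loads(b"".join(chunks))["hits"]
    except (OSError, ValueError, KeyError):
        # Missing socket, timeout, or malformed reply: use the cold path instead.
        return fallback(prompt)
```

Catching OSError covers both a missing socket file and a timeout, so a dead daemon costs at most `timeout` seconds before the fallback runs.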

Recall tuning

Env var	Default	Purpose
`MEMO_RECALL_DISABLE`	unset	Set to `1` to skip recall entirely
`MEMO_RECALL_TOP_K`	`3`	Max memories to inject
`MEMO_RECALL_MIN_SIM`	`0.6`	Cosine similarity floor
`MEMO_RECALL_MIN_PROMPT_CHARS`	`12`	Skip very short prompts
`MEMO_RECALL_BODY_CHARS`	`240`	Snippet length per memory
`MEMO_RECALL_SKIP_SLASH`	`1`	Skip recall on `/` prompts
`MEMO_RECALL_TOKEN_BUDGET`	`0`	When > 0, pack memories greedily until ~N tokens; truncate tail to fit
`MEMO_RECALL_PROJECT_BOOST`	`0.15`	Additive score boost for memories whose tags match the current project tag
`MEMO_RECALL_MIN_BODY_CHARS`	`40`	Filter out stub memories (empty or near-empty bodies)
`MEMO_RECALL_FORCE_MODE`	unset	Set to `1` to disable the warm-signal cold-start check
`MEMO_RECALL_DEBUG`	unset	Print failure reasons to stderr

The MIN_SIM=0.6 floor is empirically tuned: on a 223-doc corpus, a relevant query returns hits at 0.71–0.74 while an off-topic query ("how to bake apple pie") returns 0 hits at 0.6 (3 noise hits at 0.51–0.56 cut by the floor). Tune lower (0.5) on sparse corpora, higher (0.7) for high-precision only.
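How the recall knobs interact is easier to see in code. The sketch below is an assumption-laden illustration, not the hook's implementation: the dict keys are hypothetical, the floor is assumed to apply to the raw similarity before the project boost, and tokens are estimated at roughly 4 characters each.

```python
def select_memories(hits, project_tag=None, *, top_k=3, min_sim=0.6,
                    min_body_chars=40, project_boost=0.15, body_chars=240,
                    token_budget=0):
    """Filter, boost, rank, and pack recall hits (keys: id, score, body, tags)."""
    ranked = []
    for hit in hits:
        if len(hit["body"].strip()) < min_body_chars:
            continue  # stub memory (MEMO_RECALL_MIN_BODY_CHARS)
        if hit["score"] < min_sim:
            continue  # below the cosine floor (MEMO_RECALL_MIN_SIM)
        boost = project_boost if project_tag and project_tag in hit["tags"] else 0.0
        ranked.append((hit["score"] + boost, hit))
    ranked.sort(key=lambda pair: pair[0], reverse=True)

    selected, used_tokens = [], 0
    for score, hit in ranked[:top_k]:
        snippet = hit["body"][:body_chars]
        if token_budget > 0:
            remaining_chars = (token_budget - used_tokens) * 4  # rough estimate
            if remaining_chars <= 0:
                break
            snippet = snippet[:remaining_chars]  # truncate the tail to fit
            used_tokens += -(-len(snippet) // 4)  # ceiling division
        selected.append({"id": hit["id"], "score": round(score, 4), "snippet": snippet})
    return selected
```

With the defaults, a project-tagged hit at 0.65 outranks an untagged hit at 0.72, while a tagged hit at 0.55 is still dropped by the floor.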

Capture tuning

Env var	Default	Purpose
`MEMO_CAPTURE_CONTEXT_TURNS`	`3`	Recent exchanges fed to the helper LLM (catches multi-turn decisions)
`MEMO_CAPTURE_COOLDOWN_MIN`	`0`	Min minutes between captures in the same session
`MEMO_CAPTURE_MIN_WORDS`	`15`	Minimum word count for an extracted insight (0 disables)
`MEMO_CAPTURE_DEBUG`	unset	Print extraction results to stderr

The capture pipeline applies a quality gate before saving: insights are discarded if too short (< MEMO_CAPTURE_MIN_WORDS) or if they start with session-narrative openers like "the user…", "we discussed…", "i helped…". Only specific, durable knowledge passes through, which keeps recall precision high over time.
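A minimal sketch of that gate, using only the rules stated above (word-count floor and the three narrative openers named in the text); the function and constant names are hypothetical, and the real pipeline may check more.

```python
NARRATIVE_OPENERS = ("the user", "we discussed", "i helped")  # openers named in the text


def passes_quality_gate(insight: str, min_words: int = 15) -> bool:
    """Return True when an extracted insight is long enough and not session narrative."""
    text = insight.strip()
    if min_words > 0 and len(text.split()) < min_words:
        return False  # too short; min_words=0 disables this check
    return not text.lower().startswith(NARRATIVE_OPENERS)
```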

Hook observability — `memo hook-log`

Every recall-hook invocation is appended to a JSONL ring buffer at ~/.local/share/memo/recall.log (auto-rotated at ~200 KB):

2026-05-16 14:32:01  vec     daemon   3 hits   187 ms   "how can we improve all of this?"
2026-05-16 14:31:44  bm25    subproc  1 hit    94 ms    "resolve todo"
2026-05-16 14:28:12  vec     daemon   0 hits   203 ms   "what does prewarm do"

Each row shows timestamp · search mode (vec/bm25) · path (daemon/subprocess) · hit count · latency · prompt snippet.

memo hook-log              # last 20 entries
memo hook-log --limit 100
memo hook-log --follow     # stream live (Ctrl+C to stop)

Backfill from past Claude Code conversations

memo mine-history walks ~/.claude/projects/<hash>/*.jsonl, runs the same prefilter + helper-LLM extract + embedding-dedup pipeline as the live capture hook, and saves what's new (resumable per file):

memo mine-history --since 30 --limit 20     # last 30 days, 20 newest sessions
memo mine-history --dry-run --debug         # cost estimation, no writes

Auto-reindex on edit

Editing a memory directly in Obsidian normally needs a manual memo reindex. memo watch (foreground) or memo install-watcher (background launchd job) debounces FS events and runs Memory.reindex() automatically. Logs land in ~/Library/Logs/memo/.

Project-scoped recall

memo save auto-attaches a project:<repo> tag derived from the git toplevel of your cwd (or MEMO_PROJECT_TAG). The recall hook reads cwd from the hook payload and boosts memories whose tags match by MEMO_RECALL_PROJECT_BOOST (default 0.15). Opt out per-call with memo save --no-project-tag; disable globally with MEMO_AUTO_PROJECT_TAG=0.
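Deriving the tag can be sketched as below. The git call is the standard `git rev-parse --show-toplevel`; everything else is an assumption for illustration, including the function name, the choice to let MEMO_AUTO_PROJECT_TAG=0 win over MEMO_PROJECT_TAG, and the acceptance of the explicit tag with or without the project: prefix.

```python
import os
import subprocess
from pathlib import Path


def project_tag(cwd: str, env=None):
    """Return "project:<repo>" for cwd, or None when tagging is off or cwd is not a repo."""
    env = os.environ if env is None else env
    if env.get("MEMO_AUTO_PROJECT_TAG", "1") == "0":
        return None
    explicit = env.get("MEMO_PROJECT_TAG")
    if explicit:
        return explicit if explicit.startswith("project:") else f"project:{explicit}"
    try:
        result = subprocess.run(
            ["git", "-C", cwd, "rev-parse", "--show-toplevel"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or cwd is outside any repository
    return f"project:{Path(result.stdout.strip()).name}"
```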

Surfaces

Session briefing — `memo briefing`

memo briefing is the SessionStart hook entrypoint. Every new session it emits an additionalContext panel with three blocks:

Last session in this project — summary of the most recent session in the current cwd, with a one-line claude --resume <session_id> for instant crash recovery.
Open loops — the N memories most recently updated (default 7-day window), numbered for interactive selection. Say "give me loop 2" and the agent retrieves it.
Memory of the day — one memory picked deterministically by a SHA-256 hash of today's date, biased toward the least-recently-touched entries so the corpus rotates over time.

## Briefing

**Last session in this project** (12m ago): reviewing the project…
`claude --resume be72126f-3bcb-4faa-9a0f-dd97b8caa296`

### Open loops (last 7 days)

1. `91fc486c` **note** · memo diff as a real change surface — today [memory, versioning]
2. `5da4cdc1` **note** · Smarter recall hook — today [memory, recall]
…

### Memory of the day
`064031dd` **fact** · sqlite-vec L2 normalisation invariant — 3 days ago
> The embedder must L2-normalise before storing…

_Continue with: `give me loop N` · `/memo get <id>` · `/memo ask <question>`_

Env var	Default	Purpose
`MEMO_BRIEFING_DISABLE`	unset	Set to `1` to skip the panel
`MEMO_BRIEFING_LOOPS_N`	`5`	Number of open loops to show
`MEMO_BRIEFING_LOOPS_DAYS`	`7`	Recency window for open loops
`MEMO_BRIEFING_DEBUG`	unset	Print failures to stderr
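One way to get a "memory of the day" that is deterministic per date yet biased toward stale entries is to hash the date into an index over the oldest slice of the corpus. The sketch below assumes hypothetical record keys (`id`, `updated` as an ISO date string) and a hypothetical 25% pool; the actual bias memo applies may differ.

```python
import hashlib
from datetime import date


def memory_of_the_day(memories, today: date, pool_fraction: float = 0.25):
    """Pick one memory for `today`; the same date always yields the same pick."""
    if not memories:
        return None
    # Least-recently-touched first; id breaks ties so the order is stable.
    oldest_first = sorted(memories, key=lambda m: (m["updated"], m["id"]))
    pool = oldest_first[: max(1, int(len(oldest_first) * pool_fraction))]
    digest = hashlib.sha256(today.isoformat().encode("utf-8")).hexdigest()
    return pool[int(digest, 16) % len(pool)]
```

Because a picked memory that gets edited moves out of the stale pool, the selection rotates through the corpus over time.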

Semantic map — `memo map`

memo map reads all embeddings in memvec.db, projects them to 2D via UMAP (if umap-learn is installed) or PCA (numpy fallback), and renders a self-contained interactive HTML file.

memo map                                      # generate + open in browser
memo map --output ~/Desktop/map.html --no-open
memo map --limit 200                          # most recent 200
memo map --no-animate                         # skip the timeline animation

The HTML colours points by type, shows title/tags/date on hover, opens full metadata on click, supports a search filter, and animates corpus growth over time. For better cluster topology on 50+ entries, install umap-learn (pipx runpip mlx-memo install umap-learn); without it, PCA is used.

Time-machine

Among the agent-memory projects compared in this manual, memo is the only one that lets you rewind the corpus to any past date. history.db is an append-only audit log of every save/update/delete; a snapshot at any time T is rebuilt by starting from the current state and undoing events in reverse order until T is reached. See time-machine.svg for the algorithm.

memo as-of ask "MLX vs Ollama" --date 2026-02-01   # what did I think 3 months ago?
memo diff --from 2026-03-01 --to 2026-04-30        # what changed between releases?
memo as-of search "auth middleware" --date 2026-03-15
memo as-of list --date 2026-03-01                  # memories that existed then

Use cases: debugging agent regressions, reproducible AI behaviour (serve a past snapshot as an alternate MCP), personal audit, and compliance ("what did the model know when it took action X?").
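The reverse-replay idea can be sketched without any database. The event schema here is hypothetical (memo's history.db layout is not shown in this manual): each event carries an ISO-8601 `ts`, an `op`, the record `id`, and `before`, the record as it was prior to the event.

```python
def snapshot_as_of(current: dict, events: list, as_of: str) -> dict:
    """Rebuild the corpus as it stood at `as_of` by undoing every newer event.

    `current` maps id -> record. Timestamps must share one ISO-8601 format so
    that string comparison matches chronological order.
    """
    state = dict(current)  # never mutate the live corpus
    for event in sorted(events, key=lambda e: e["ts"], reverse=True):
        if event["ts"] <= as_of:
            break  # this event and all older ones already belong to the snapshot
        if event["op"] == "save":
            state.pop(event["id"], None)  # undo a creation
        else:
            state[event["id"]] = event["before"]  # undo an update or a delete
    return state
```

Replaying backwards from "now" means recent dates are cheap to reconstruct: only the events newer than the target date are touched.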

CLI reference

# ── Core CRUD ──────────────────────────────────────────────────────────────
memo save 'body markdown' --title 'X' -t mlx -t local
memo search 'query' --limit 5
memo list --limit 20 --type decision
memo get <id>
memo update <id> --title 'X2' -t mlx -t local --type decision
memo update <id> --content -      # read replacement body from stdin
memo delete <id> --yes
memo reindex                      # absorb edits made directly in Obsidian
memo stats
memo ask 'what changed in the embedder this month?'

# ── History & audit ────────────────────────────────────────────────────────
memo record-history <id>                # chronological audit trail for one record with field diffs
memo history                      # recent save/update/delete events across all records

# ── Ambient memory commands (also run by hooks) ────────────────────────────
memo briefing                     # preview the SessionStart panel in the terminal
memo recall-hook                  # UserPromptSubmit hook (reads JSON from stdin)
memo prewarm                      # pre-load MLX models (SessionStart hook)
memo capture-stop                 # extract insights from last exchange (Stop hook)
memo session checkpoint           # snapshot current session state (Stop hook)
memo session recent --limit 5     # list recent sessions

# ── Semantic map ───────────────────────────────────────────────────────────
memo map                         # generate + open in browser (UMAP or PCA → Plotly HTML)
memo map --output ~/Desktop/map.html --no-open
memo map --limit 200 --no-animate

# ── Setup & maintenance ────────────────────────────────────────────────────
memo doctor                       # self-check
memo doctor --gc                  # report orphans (store ↔ disk)
memo doctor --gc --fix            # drop orphan store rows (.md never auto-deleted)
memo install-slash                # configure Claude Code, Codex, Devin Desktop, Devin
memo mcp-command --client devin-desktop # print Devin Desktop mcp.json block
memo init                         # re-run first-run picker
memo migrate-vault <new-path>     # move memories to a different folder
memo backup --out memo.zip        # backup .md files + index

# ── Time-machine ───────────────────────────────────────────────────────────
memo as-of search 'query' --date 2026-03-01
memo as-of ask 'question' --date 2026-03-01
memo as-of list --date 2026-03-01
memo diff --from 2026-03-01 --to 2026-04-30

# ── Knowledge graph ────────────────────────────────────────────────────────
memo entities                     # top entities across the corpus
memo entity <name>                # memories that mention a specific entity
memo extract-entities --all       # populate the entity graph (Qwen 3B, batch)
memo consolidate                  # cluster near-duplicates + merge proposals

# ── Backfill & watching ────────────────────────────────────────────────────
memo mine-history --since 30      # backfill memories from past Claude Code chats
memo watch                        # foreground file-watcher: auto-reindex on .md edit
memo install-watcher              # background watcher via launchd plist
memo uninstall-watcher            # remove the launchd watcher job

# ── Recall daemon ──────────────────────────────────────────────────────────
memo recall-daemon start          # start the persistent recall daemon
memo recall-daemon stop
memo recall-daemon status

# ── Observability ──────────────────────────────────────────────────────────
memo hook-log                     # last 20 recall-hook entries: mode, via, hits, latency
memo hook-log --limit 50
memo hook-log --follow            # stream new entries as they arrive

# ── Updates ────────────────────────────────────────────────────────────────
memo self-update                  # upgrade via pipx/uv + re-warm models
memo self-update --check          # check PyPI for a newer version without installing

# ── Live dashboard ─────────────────────────────────────────────────────────
memo tui                          # live terminal dashboard (Ctrl+C exits)

Live dashboard — `memo tui`

Six panels, refreshed every second: corpus (totals, project tags, top types), runtime (MLX warm/cold flags, vault size, watcher state), recent saves, recent recalls (mode + path per row, live daemon status), top tags, and activity (14-day saves/recalls sparklines). The dashboard is read-only: it reads history.db, the JSONL recall log, the daemon PID file, and the warm-signal file without writing to any of them. Quit with q, ESC, or Ctrl+C.

Updating — `memo self-update`

memo self-update detects the active install method (checks pipx list then uv tool list), runs the appropriate upgrade, and re-warms models with memo prewarm --download-all. memo self-update --check compares installed vs latest PyPI without installing.

Configuration

All env vars are optional; defaults aim at a fresh Apple Silicon Mac (or a Linux/Ubuntu [cpu] install — see ubuntu.md). On first run in an interactive shell, an arrow-key picker asks where memories should live and persists the choice to ~/.config/memo/config.toml. Re-run it with memo init. Hooks get MEMO_NONINTERACTIVE=1 so they never trigger the picker.

Resolution precedence (highest first): explicit kwargs → MEMO_* env vars → ~/.config/memo/config.toml → legacy MEMO_VAULT_PATH + MEMO_MEMORY_SUBDIR → hardcoded defaults.
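The precedence chain is a "first non-empty value wins" lookup. A sketch for the data directory, where the function name, the `data_dir` config key, and the way the two legacy variables are joined are all assumptions made for illustration:

```python
import os


def resolve_data_dir(kwarg=None, env=None, config=None):
    """Return the data dir from the highest-priority source that provides one."""
    env = os.environ if env is None else env
    config = config or {}
    legacy = None
    if env.get("MEMO_VAULT_PATH") and env.get("MEMO_MEMORY_SUBDIR"):
        # Assumed legacy layout: <vault>/<subdir>.
        legacy = os.path.join(env["MEMO_VAULT_PATH"], env["MEMO_MEMORY_SUBDIR"])
    candidates = [
        kwarg,                      # 1. explicit kwargs
        env.get("MEMO_DATA_DIR"),   # 2. MEMO_* env vars
        config.get("data_dir"),     # 3. config.toml (hypothetical key name)
        legacy,                     # 4. legacy env pair
        "~/Documents/memo",         # 5. hardcoded default
    ]
    return next(c for c in candidates if c)
```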

Storage & paths

Env var	Default	What
`MEMO_DATA_DIR`	`~/Documents/memo`	Where memory `.md` files live
`MEMO_VAULT_PATH`	unset	Optional Obsidian vault for `memo ingest`
`MEMO_STATE_DIR`	`~/.local/share/memo`	sqlite-vec DB + state
`MEMO_CONFIG_FILE`	`~/.config/memo/config.toml`	Override config-file path
`MEMO_NONINTERACTIVE`	unset	Set to `1` in hooks to skip the first-run picker

Models

Env var	Default	What
`MEMO_MODEL_PROFILE`	`balanced`	Model bundle: `light`, `balanced`, or `quality`
`MEMO_LLM_MODEL`	`mlx-community/Qwen2.5-7B-Instruct-4bit`	Chat tier
`MEMO_HELPER_MODEL`	`mlx-community/Qwen2.5-3B-Instruct-4bit`	Helper tier
`MEMO_EMBEDDER_MODEL`	`mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ`	Embedder
`MEMO_EMBEDDER_DIMS`	`1024`	Embedding dim — must match the embedder
`MEMO_RERANKER_ENABLED`	`1` in `balanced`/`quality`	Enable cross-encoder rerank for hybrid search
`MEMO_RERANKER_MODEL`	`mku64/Qwen3-Reranker-0.6B-mlx-8Bit`	MLX reranker model
`MEMO_RERANK_INPUT_K`	`30`	Hybrid candidates sent to the reranker
`MEMO_RERANK_FUSION_ALPHA`	`0.7`	Weight of reranker score vs RRF position bonus
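The last two rows describe a two-stage pipeline: reciprocal rank fusion (RRF) picks the candidates, then the reranker score is blended with the RRF position. A sketch under stated assumptions: `k=60` is the conventional RRF constant (not a documented memo value), the blend formula and the normalisation of the RRF bonus are guesses at what "alpha-fusion" means, and both function names are hypothetical.

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over every ranking (rank is 1-based)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def hybrid_search(vec_ids, bm25_ids, input_k, rerank_scores=None, alpha=0.7):
    """Fuse vec + BM25 rankings, keep the top `input_k`, then optionally rerank."""
    fused = rrf([vec_ids, bm25_ids])
    candidates = sorted(fused, key=fused.get, reverse=True)[:input_k]
    if not rerank_scores:
        return candidates  # reranker disabled: plain RRF order
    top = max(fused[c] for c in candidates)

    def final(doc_id):
        # alpha weights the reranker; the remainder goes to the normalised RRF bonus.
        return alpha * rerank_scores.get(doc_id, 0.0) + (1 - alpha) * fused[doc_id] / top

    return sorted(candidates, key=final, reverse=True)
```

Here `input_k` plays the role of MEMO_RERANK_INPUT_K and `alpha` the role of MEMO_RERANK_FUSION_ALPHA; a document found by both retrievers accumulates two RRF terms, which is why it tends to lead the fused list.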

Search

Env var	Default	What
`MEMO_MAX_CONTENT_CHARS`	`64000`	Truncate body before embed
`MEMO_SEARCH_DEFAULT_LIMIT`	`10`	Default `--limit` for search
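`MEMO_SEARCH_DECAY_HALFLIFE`	`0`	When > 0, blend recency into scores. Decay time N in days: with the weight `exp(-days/N)` as written, a memory N days old keeps about 37% (1/e) of the recency signal, so N is an e-folding time rather than a strict half-life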
`MEMO_SEARCH_DECAY_ALPHA`	`0.15`	Weight of decay signal vs raw similarity
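The two decay knobs can be read as a convex blend of similarity and a recency weight. The sketch below uses the `exp(-days/N)` weight from the table; the blend formula itself is an assumption about what "weight of decay signal vs raw similarity" means, and the function name is hypothetical.

```python
import math


def blend_recency(similarity: float, age_days: float,
                  n_days: float = 0.0, alpha: float = 0.15) -> float:
    """Blend raw similarity with a recency weight; n_days <= 0 disables decay."""
    if n_days <= 0:
        return similarity
    recency = math.exp(-age_days / n_days)  # 1.0 for a brand-new memory
    return (1 - alpha) * similarity + alpha * recency
```

With a small alpha the blend mostly reorders near-ties: two memories with similar scores are separated by age, while a clearly better match still wins.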

Tagging

Env var	Default	What
`MEMO_AUTO_PROJECT_TAG`	`1`	Auto-add `project:<repo>` tag from git toplevel on save
`MEMO_PROJECT_TAG`	unset	Explicit project tag (overrides git-toplevel detection)

Recall, capture, and briefing knobs are listed under Ambient memory.

Model profiles

light: 0.6B embedder, Qwen2.5 chat/helper, no reranker. Best for low-latency hooks.
balanced: 0.6B embedder + 0.6B reranker + Qwen2.5 chat/helper. Default for most users.
quality: 4B embedder (2560 dims) + 0.6B reranker + Qwen3 4B chat. Requires rm ~/.local/share/memo/memvec.db && memo reindex when switching from 1024-dim profiles.

If models are still downloading, you can save without MLX and keep keyword search available:

memo save "text to remember" --title "Short title" --defer-embed
memo search "text" --mode bm25
memo reindex     # later, once the embedder is cached

Upgrading the embedder

The default 0.6B is fast (~50 ms/embed) and small (~600 MB), but recall on diffuse queries can be noisy. For corpora in the 200–2000 memory range, consider swapping to a larger variant.

Model	Dims	Disk	Recall	Per-embed
`Qwen3-Embedding-0.6B-4bit-DWQ` (default)	1024	~600 MB	OK	~50 ms
`Qwen3-Embedding-4B-4bit-DWQ`	2560	~3 GB	better	~200 ms
`Qwen3-Embedding-8B-4bit-DWQ`	4096	~5 GB	best	~400 ms

hf download mlx-community/Qwen3-Embedding-4B-4bit-DWQ   # 1) pre-download
export MEMO_MODEL_PROFILE=quality                       # 2) point memo at it
memo backup --out memo-pre-4b.zip                       # 3) backup before re-embed
rm ~/.local/share/memo/memvec.db && memo reindex        # 4) wipe + rebuild
memo doctor --strict-runtime

A dimension mismatch is a hard error: MEMO_EMBEDDER_DIMS must match the new model's hidden size, and memo doctor validates it at load.

Design and comparison

Design notes

One sqlite file, no Qdrant. sqlite-vec outperforms a small Qdrant snapshot at the corpus size memo targets (a few thousand entries, single writer). One file makes reset trivial: rm memvec.db.
Embed title + body together. Titles carry the highest-density retrieval signal, and because long inputs are head-truncated (the head is kept, the tail dropped), prepending the title keeps it inside the embedded window on long bodies. Pure retag/type changes skip the embedder.
.md is the storage of record. Edit in Obsidian; the next memo reindex picks it up via body_hash mismatch.
Head-truncate long inputs + append EOS. The embedder caps at 512 tokens; we head-truncate and explicitly append <|im_end|> so Qwen3-Embedding's last-token pool lands on the EOS hidden state it was fine-tuned for.
Asymmetric retrieval. Queries get an Instruct: …\nQuery: … prefix; documents go raw. Without the prefix, cosine collapses toward 0.
Cosine distance metric. The vec0 schema declares distance_metric=cosine, so score = 1 − distance is the cosine similarity itself (nominally in [−1, 1]) and compares directly against thresholds such as MEMO_RECALL_MIN_SIM.
No Ollama dependency, anywhere. pyproject.toml doesn't declare it; doctor doesn't probe :11434.
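Three of the notes above (title + body embedding, asymmetric retrieval, and the cosine score) fit in a few lines. This is a sketch only: the function names and the blank-line separator between title and body are assumptions, and the task wording for the query prefix is left to the caller because the manual elides it.

```python
import math


def document_text(title: str, body: str) -> str:
    """Documents are embedded raw, with the title prepended to the body."""
    return f"{title}\n\n{body}" if title else body


def query_text(query: str, task: str) -> str:
    """Queries carry the instruction prefix; documents never do."""
    return f"Instruct: {task}\nQuery: {query}"


def cosine_score(a, b) -> float:
    """score = 1 - cosine distance, computed here in pure Python."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    distance = 1.0 - dot / norm
    return 1.0 - distance
```

Only the query side changes shape, so existing document embeddings stay valid when the task wording is tuned.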

How memo compares

memo's neighbours diverge on the things that matter day-to-day: where the model runs, where the data lives, how recall is wired, and whether you can read your own memory in plain text.

	memo	`mem0`	`letta`	`cognee`	`supermemory`	MCP `memory` ref
Runtime	MLX, in-process	Cloud API or Ollama	Postgres + LLM API	Cloud or Ollama	Cloud SaaS	Node, in-process
Network in hot path	0	yes or `:11434`	yes (LLM API)	yes (LLM API)	always	yes (LLM API)
Vector store	sqlite-vec (one file)	Qdrant / pgvector	Postgres + pgvector	LanceDB / Qdrant	hosted	in-memory JSON
External daemons	none (recall daemon optional)	Ollama + Qdrant	Postgres	Postgres / vector DB	none (SaaS)	none
Storage of record	markdown files	DB blob	DB rows	DB rows + graph	hosted DB	JSON entity graph
Human-readable / editable	✅ Obsidian/vim	❌	❌	❌	❌	partial (JSON)
Hybrid retrieval + reranker	✅ vec + BM25 + RRF + cross-encoder	vec	vec	vec + graph	vec	entity-based
Ambient recall (zero invoke)	✅ hooks + daemon (<200 ms)	❌	n/a	❌	❌	❌
Time-machine (past snapshots)	✅ `memo as-of …`	❌	❌	❌	❌	❌
License	MIT	Apache-2.0	Apache-2.0	Apache-2.0	proprietary	MIT

Projects move fast — cells reflect the public state of each repo at the time of writing. PR a correction if any is stale.

The differentiators in plain terms:

Time-machine — every other store serves current state only. memo rebuilds any past corpus state from its audit log. No competitor can retrofit this without an audit log they don't have.
100% local hot path, no Ollama — LLM, embedder, and reranker run in-process via MLX. No :11434 round-trip, no Docker, no provider key.
Markdown is the storage of record — plain .md you can edit, sync, and grep; the sqlite index is rebuildable.
Ambient recall + session awareness as a turnkey hook bundle — the agent sees the right memories before it answers, and the corpus grows on its own.
MCP is a primary interface — same stdio contract for every client on day one, with a deliberately tiny default tool surface.
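
The time-machine idea, rebuilding a past corpus state from an audit log, can be sketched as a replay over append-only events. The `AuditEvent` shape, the `save` / `delete` operation names, and `corpus_as_of` are all hypothetical; memo's real audit-log schema is not described in this section.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditEvent:
    ts: str          # ISO-8601 UTC timestamp; same-format strings sort chronologically
    op: str          # "save" or "delete" (assumed operation names)
    memory_id: str
    body: str = ""

def corpus_as_of(events, as_of: str) -> dict:
    """Replay events up to and including `as_of`; return {memory_id: body}."""
    state = {}
    for event in sorted(events, key=lambda e: e.ts):
        if event.ts > as_of:
            break
        if event.op == "save":
            state[event.memory_id] = event.body
        elif event.op == "delete":
            state.pop(event.memory_id, None)
    return state

log = [
    AuditEvent("2024-01-01T00:00:00Z", "save", "m1", "first draft"),
    AuditEvent("2024-02-01T00:00:00Z", "save", "m1", "revised"),
    AuditEvent("2024-03-01T00:00:00Z", "delete", "m1"),
]
january = corpus_as_of(log, "2024-01-15T00:00:00Z")
april = corpus_as_of(log, "2024-04-01T00:00:00Z")
```

The point of the sketch is that nothing is lost on update or delete: the current state is just the replay taken to the end of the log.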

When not to pick memo: you need a hosted multi-tenant service (supermemory/mem0 cloud); you want an explicit core/archival agent runtime (letta); or you want a knowledge-graph + ontology layer (cognee).

Not on Apple Silicon? memo still runs standalone on Linux/Ubuntu via a CPU backend (search / recall / save), but the reranker and the LLM features (ask / synthesize / dream) are MLX-only. See ubuntu.md.

Experimental modules

These ship in the package but are not covered by CI, not exposed via MCP tools, and may change without notice. They stay inside memo's pillar — local semantic storage, retrieval, and corpus-level utilities; coordination, federation, and orchestration belong outside memo's surface.

Module	What it does
`multimodal`	Cross-modal semantic search over images, audio, and text
`collaborative`	Shared knowledge graph across multiple users
`sharing`	Per-memory sharing links and permission grants
`encryption`	AES-256-GCM file-level primitives (gated OFF; `MEMO_ENCRYPTION_ENABLED=1`)
`contradict`	Contradiction and staleness radar with triage workflow
`chunker`	Heading-aware sub-document chunking for long memories
`crossref`	Obsidian `[[wikilink]]` backlink index and multi-hop traversal
`contextual`	Conversation-history-aware recall boosting
`navigation`	BFS path finding and community detection on the entity graph
`sync`	Multi-device sync and compressed backups
`versioning`	Per-memory version history and unified-diff rollback

The current inventory of broader corpus/workflow experiments lives in src/memo/experimental_index.md.
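
To make the `crossref` and `navigation` rows concrete, here is a sketch of the underlying techniques: a `[[wikilink]]` backlink index plus breadth-first search for multi-hop paths. It does not use the experimental modules themselves; `build_links`, `shortest_path`, and the note names are hypothetical, and the real modules may behave differently.

```python
import re
from collections import defaultdict, deque

# Capture the link target, stopping at an alias ("|") or heading ("#").
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def build_links(notes: dict):
    """Return (forward, backlinks) maps from {note_name: markdown_text}."""
    forward = {name: set() for name in notes}
    backlinks = defaultdict(set)
    for name, text in notes.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            forward[name].add(target)
            backlinks[target].add(name)
    return forward, backlinks

def shortest_path(forward: dict, start: str, goal: str):
    """Breadth-first search over forward links; None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(forward.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

notes = {
    "a": "see [[b]] and [[c|alias]]",
    "b": "links to [[d#section]]",
    "c": "",
    "d": "",
}
forward, backlinks = build_links(notes)
path = shortest_path(forward, "a", "d")
```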

Consciousness-stack integration

memo is one of three sovereign systems: Synapse (federator), Memflow (cross-Mac operational continuity), and memo (semantic corpus). Integration is opt-in everywhere — single-Mac users without Synapse or Memflow see zero behaviour change.

Surface	Doc	Default	Opt-in knob
Synapse adapter — `MemoSynapseBackend`, provenance, freeze-write	synapse-adapter.md	OFF	`MEMO_RESPECT_SYNAPSE_FREEZE=1`
Embedder daemon — shared MLX sidecar protocol	embedder-daemon.md	ON via `SessionStart`	—
Contradict loop — synapse pulls via `memo contradict list --json`	contradict-loop.md	ON (synapse-side)	—
Receipts — operational breadcrumbs to memflow	receipts.md	OFF	`MEMO_EMIT_RECEIPTS=1`
Briefing — synapse `present_state` in the session panel	briefing.md	ON when `synapse` is on PATH	`MEMO_BRIEFING_SYNAPSE_DISABLE=1`
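
The defaults in the table can be read as a small decision function. The sketch below is an interpretation, not memo's code: it assumes that only the literal value `1` flips a knob, and `integration_state` is a hypothetical name. The `which` parameter is injectable so the PATH check can be tested without a real `synapse` binary.

```python
import os
import shutil

def integration_state(env=None, which=shutil.which) -> dict:
    """Resolve which opt-in integrations are active for this process."""
    env = os.environ if env is None else env
    synapse_on_path = which("synapse") is not None
    return {
        # OFF unless explicitly enabled.
        "respect_synapse_freeze": env.get("MEMO_RESPECT_SYNAPSE_FREEZE") == "1",
        "emit_receipts": env.get("MEMO_EMIT_RECEIPTS") == "1",
        # ON when synapse is on PATH, unless explicitly disabled.
        "briefing_synapse": synapse_on_path
        and env.get("MEMO_BRIEFING_SYNAPSE_DISABLE") != "1",
    }

# A single-Mac user without synapse and with no knobs set: everything off.
plain = integration_state(env={}, which=lambda name: None)
```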
