Architecture

Overview

sdr-memory provides long-term memory for LLM agents using Sparse Distributed Representations (SDRs). It runs as a single-process daemon with a Unix socket interface and persists memories in SQLite.

┌─────────────┐     JSON/socket     ┌───────────────┐     SQLite    ┌────────────┐
│  LLM Agent  │ ──────────────────> │  MemoryDaemon │ ────────────> │  memory.db │
│  or Script  │ <────────────────── │  (server.py)  │ <──────────── │  (WAL mode)│
└─────────────┘                     └───────────────┘               └────────────┘

Core Pipeline

1. Text Encoding (sdr.py)

Text is converted to a sparse binary vector via character-level trigram hashing:

  1. Trigrams: "hello" -> ["  h", " he", "hel", "ell", "llo", "lo ", "o  "]
  2. Hashing: Each trigram is hashed to a position in a 4096-bit vector
  3. Sparsity cap: At most 80 bits are active (deterministic: lowest indices kept)
  4. Packing: The 4096-bit vector is packed into 512 bytes for storage

The encoding is:

  • Deterministic: Same text always produces the same SDR
  • Language-agnostic: Works with any language (trigrams capture character patterns)
  • Overlap-preserving: Similar texts share active bits, enabling similarity search
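
The four encoding steps can be sketched as follows. This is a minimal illustration, not the code in sdr.py: the two-space padding, the choice of BLAKE2b as the hash, the bit order used for packing, and the function names `trigrams` and `encode` are all assumptions. Only the 4096-bit width, the 80-bit cap, and the "lowest indices kept" rule come from the description above.

```python
import hashlib

SDR_BITS = 4096   # vector width, from the description above
MAX_ACTIVE = 80   # sparsity cap, from the description above


def trigrams(text: str) -> list[str]:
    """Character trigrams of the text padded with two spaces per side (assumed)."""
    padded = f"  {text}  "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def encode(text: str) -> bytes:
    """Return a packed 512-byte SDR for the given text."""
    # Hash every trigram to a bit position. BLAKE2b is an assumption; any
    # stable hash (not Python's salted built-in hash()) keeps this deterministic.
    positions = {
        int.from_bytes(
            hashlib.blake2b(t.encode("utf-8"), digest_size=8).digest(), "big"
        ) % SDR_BITS
        for t in trigrams(text)
    }
    # Sparsity cap: keep at most MAX_ACTIVE bits, lowest indices first.
    active = sorted(positions)[:MAX_ACTIVE]
    # Pack 4096 bits into 512 bytes (least-significant bit first, assumed).
    packed = bytearray(SDR_BITS // 8)
    for pos in active:
        packed[pos // 8] |= 1 << (pos % 8)
    return bytes(packed)
```

Because two texts that share trigrams also share the corresponding hashed positions, their packed vectors overlap in those bits, which is what the similarity search relies on.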

2. Salience Filter (salience.py)

Before storing, text passes through a salience filter that rejects:

  • Procedural narration ("let me check...", "starting analysis...")
  • Bare file paths
  • Very short text (<8 characters)

And accepts:

  • High-signal operational keywords (errors, incidents, resolutions)
  • Medium-length declarative facts (>=60 characters)
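
A filter of this shape might look like the sketch below. The regular expressions, the keyword list, and the order in which the rules are applied are hypothetical; only the 8-character and 60-character thresholds and the categories themselves are taken from the lists above.

```python
import re

MIN_LENGTH = 8         # shorter text is rejected (from the list above)
FACT_MIN_LENGTH = 60   # declarative facts of this length or more are accepted

# Hypothetical patterns, chosen only to illustrate each category.
NARRATION_RE = re.compile(r"^(let me|starting|now i will|i'll)\b", re.IGNORECASE)
BARE_PATH_RE = re.compile(r"^(~|\.{1,2})?(/[\w.\-]+)+/?$")
SIGNAL_RE = re.compile(
    r"\b(error|incident|resolved|fixed|failed|outage)\b", re.IGNORECASE
)


def is_salient(text: str) -> bool:
    """Return True if the text looks worth storing as a memory."""
    text = text.strip()
    if len(text) < MIN_LENGTH:
        return False  # very short text
    if NARRATION_RE.match(text) or BARE_PATH_RE.match(text):
        return False  # procedural narration or a bare file path
    if SIGNAL_RE.search(text):
        return True   # high-signal operational keyword
    return len(text) >= FACT_MIN_LENGTH  # medium-length declarative fact
```

Checking the reject rules first means that narration containing a keyword ("let me check the error...") is still dropped; whether salience.py orders its rules this way is not stated above.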

3. Storage (memory.py)

MemoryStore manages a SQLite database with WAL mode for concurrent reads:

Column         Type     Description
-------------  -------  -------------------------------------
id             INTEGER  Auto-incrementing primary key
text           TEXT     Normalized memory text
bitvec         BLOB     Packed 512-byte SDR
active_bits    INTEGER  Number of active bits
metadata_json  TEXT     JSON metadata (source, session, etc.)
created_ts     REAL     Unix timestamp
updated_ts     REAL     Unix timestamp
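
For illustration, a schema with these columns could be created as shown below. The table name `memories`, the `NOT NULL` constraints, and the helper `open_store` are assumptions; the column names and types follow the table above.

```python
import sqlite3

# Table name and constraints are assumed; columns mirror the table above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    text          TEXT    NOT NULL,
    bitvec        BLOB    NOT NULL,
    active_bits   INTEGER NOT NULL,
    metadata_json TEXT,
    created_ts    REAL    NOT NULL,
    updated_ts    REAL    NOT NULL
)
"""


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the database file in WAL mode and ensure the table exists."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(SCHEMA)
    conn.commit()
    return conn
```

WAL mode applies to file-backed databases; an in-memory database (`:memory:`) keeps its own journal mode.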

4. Retrieval

Query text is encoded to an SDR, then compared against all stored SDRs using Hamming distance. Results are scored as 1 - (hamming_distance / 4096) and returned sorted by decreasing similarity.

This is a brute-force scan. For typical agent memory sizes (<100K memories), this completes in milliseconds thanks to NumPy's vectorized XOR operations.
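
The scan can be sketched with NumPy as follows. The function name `rank`, the byte-wise popcount lookup table, and the return shape are assumptions made for the example; the XOR comparison and the scoring formula are the ones described above.

```python
import numpy as np

SDR_BITS = 4096
SDR_BYTES = SDR_BITS // 8

# Number of set bits for every possible byte value (lookup-table popcount).
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint16)


def rank(query: bytes, stored: list[bytes], limit: int = 5) -> list[tuple[int, float]]:
    """Return (row_index, score) pairs for the best matches, highest score first."""
    if not stored:
        return []
    q = np.frombuffer(query, dtype=np.uint8)
    matrix = np.frombuffer(b"".join(stored), dtype=np.uint8).reshape(-1, SDR_BYTES)
    # XOR marks the differing bits; summing their popcounts gives Hamming distance.
    hamming = POPCOUNT[np.bitwise_xor(matrix, q)].sum(axis=1)
    scores = 1.0 - hamming / SDR_BITS
    order = np.argsort(-scores, kind="stable")[:limit]
    return [(int(i), float(scores[i])) for i in order]
```

One consequence of this formula is worth keeping in mind when choosing a cutoff: two SDRs with at most 80 active bits each can differ in at most 160 positions, so the score never falls below 1 - 160/4096 ≈ 0.961, and meaningful differences between results show up in the second and third decimal places.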

5. Daemon Protocol (server.py)

The daemon speaks a JSON-lines protocol over a Unix stream socket: each request is a single JSON line, and each response is a single JSON line.

Actions:

{"action": "ping"}
{"action": "stats"}
{"action": "store", "text": "...", "metadata": {...}}
{"action": "query", "text": "...", "limit": 5}
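
A client for this protocol needs only the standard library. The sketch below assumes one request per connection and a newline-terminated reply; the helper name `send_request` and the socket path in the usage comment are hypothetical, and the shape of the daemon's responses is not specified above.

```python
import json
import socket


def send_request(sock_path: str, request: dict, timeout: float = 5.0) -> dict:
    """Send one JSON line to the daemon and return its decoded JSON-line reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
        # Read until the newline that terminates the response line.
        buf = bytearray()
        while not buf.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break  # peer closed the connection
            buf.extend(chunk)
    return json.loads(buf.decode("utf-8"))


# Usage (the socket path is a placeholder, not the daemon's documented default):
# reply = send_request("/tmp/sdr-memory.sock", {"action": "query", "text": "disk full", "limit": 5})
```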

Design Decisions

Why SDR instead of vector embeddings?

  • Minimal dependencies: No GPU, no embedding model, no vector DB server (retrieval uses NumPy)
  • Deterministic: No model loading, no floating-point drift
  • Fast: Hamming distance on packed bits is extremely cache-friendly
  • Interpretable: Active bits correspond to trigram hashes

Why SQLite?

  • Zero setup: No server process, no configuration
  • WAL mode: Concurrent readers with a single writer
  • Portable: Database is a single file, easy to back up or move

Why Unix sockets?

  • Fast: No TCP overhead, no serialization beyond JSON
  • Secure: File-system permissions control access
  • Simple: Standard library support in Python