Self-learning memory for Claude Code. Your agent gets smarter every session — automatically.
sixthsense extracts insights from every Claude Code conversation, scores them by reinforcement, synthesizes them into persistent memory, and auto-backports the strongest learnings into your skill files. Zero manual curation.
Built with agent-reverse — the capability extraction tool that helped bootstrap this entire system.
```
  Session 1     Session 2     Session 3   ...
      │             │             │
      ▼             ▼             ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│  Extract  │ │  Extract  │ │  Extract  │   SessionEnd hook
│ learnings │ │ learnings │ │ learnings │   runs automatically
└─────┬─────┘ └─────┬─────┘ └─────┬─────┘
      │             │             │
      ▼             ▼             ▼
┌──────────────────────────────────────────┐
│          SQLite: learnings.db            │
│                                          │
│    Learnings + Signals (event stream)    │
│    extracted → reinforced → recalled     │
└────────────────────┬─────────────────────┘
                     │
                     ▼  (periodic / on-demand)
            ┌────────────────┐
            │  Score Engine  │   signal weights + time decay
            │ score = Σw - d │
            └───────┬────────┘
                    │
           ┌────────┴─────────┐
           │                  │
           ▼                  ▼
    ┌────────────┐     ┌────────────┐
    │ Synthesize │     │  Backport  │
    │ → MEMORY   │     │ → Skills   │   score ≥ 6 → skill file
    │   .md      │     │ → CLAUDE   │   score ≥ 8 → CLAUDE.md
    └────────────┘     │   .md      │
                       └────────────┘
```
The conversation IS the review loop. When you correct Claude ("no, use X instead"), that's human input already captured as a learning. When the same insight keeps appearing across sessions, that's signal. When it crosses a threshold, it graduates to a skill file — permanently.
sixthsense combines ideas from three projects that pioneered different aspects of agent memory. This section documents what was adapted and how.
Source: Inside our in-house data agent — OpenAI's internal data agent architecture. Open-source implementation: agno-agi/dash by Ashpreet Bedi.
OpenAI built an internal data agent that uses six layers of context to navigate 600 petabytes of data. Each layer serves a different purpose:
- Schema Metadata — structural knowledge (column names, data types)
- Domain Expert Descriptions — curated semantics and business meaning
- Codex Enrichment — crawled codebase understanding
- Institutional Knowledge — Slack, Docs, Notion knowledge
- Learning Memory — corrections and nuances from past conversations
- Live Queries — real-time data when prior context is insufficient
What sixthsense adapted: The layered architecture concept. We mapped each layer to a Claude Code equivalent:
| OpenAI Layer | sixthsense equivalent |
|---|---|
| L1: Schema Metadata | CLAUDE.md (project instructions) |
| L2: Domain Descriptions | learnings.db (cross-session memory) |
| L3: Codex Enrichment | Score-ranked synthesis into MEMORY.md |
| L4: Institutional Knowledge | Recalled learnings in the current session |
| L5: Learning Memory | Tool failure tracking and pattern detection |
| L6: Live Queries | Backported learnings in skill files |
The key adaptation: making layers talk to each other automatically. In OpenAI's system, layers are queried independently. In sixthsense, a pattern at L5 (tool failures) gets stored at L2 (learnings), surfaced at L3 (synthesis), and promoted to L1 (skill files) — without human intervention.
Source: agno-agi/dash by Ashpreet Bedi. MIT License. Dash is a self-learning data agent that grounds its answers in 6 layers of context, inspired by OpenAI's in-house implementation.
Dash demonstrated that agent memory should be scored, not just stored. Raw facts decay without reinforcement. Dash implements this through a self-learning loop where every query improves future performance.
What sixthsense adapted: The signal-weighted scoring model. Dash showed that memory quality comes from reinforcement patterns, not just storage. We implemented this as an append-only signal stream:
| Signal | Weight | When |
|---|---|---|
| `extracted` | +1 | Learning first captured |
| `reinforced` | +2 | Same learning seen again (fuzzy match) |
| `recalled` | +2 | Learning cited in MEMORY.md synthesis |
| `corrected` | +3 | User correction matches existing topic |
| `applied` | +3 | Learning backported to a skill file |
Decay: -0.5 per 30 days since last signal. Unused knowledge fades. This decay mechanism ensures the memory stays fresh — an idea directly from Dash's approach to grounded, self-improving context.
Source: memvid/memvid (the core memory engine) and memvid/claude-brain (the Claude Code integration). MIT License. Memvid gives Claude Code "photographic memory" through compressed `.mv2` memory files.
Memvid showed that agent memory retrieval doesn't need vector databases or embedding infrastructure. Their .mv2 format stores memories as compressed, indexed single files — searchable, portable, zero-dependency. claude-brain brought this to Claude Code specifically, proving that persistent cross-session memory is practical for coding agents.
What sixthsense adapted:
- SQLite over vectors: Following memvid's philosophy, learnings are stored as text in SQLite with signal metadata. No embeddings, no FAISS, no infrastructure. Just a single `.db` file.
- Fuzzy dedup over semantic search: `difflib.SequenceMatcher` at a 70% threshold catches paraphrased duplicates without ML models — similar to how memvid achieves retrieval without vector similarity.
- Score ranking over similarity search: Instead of "find similar memories" (the vector DB approach), sixthsense asks "what are the strongest memories?" — a simpler, more reliable question. This mirrors memvid's design principle that smart indexing beats brute-force similarity.
- Single-file portability: memvid's `.mv2` and sixthsense's `learnings.db` share the same design goal — one file contains everything; copy it anywhere.
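For illustration, the 70% fuzzy-match check needs nothing beyond the standard library. The helper name below is ours, not the script's:

```python
from difflib import SequenceMatcher

def is_duplicate(new: str, existing: str, threshold: float = 0.7) -> bool:
    """True when two learnings are similar enough to count as one."""
    ratio = SequenceMatcher(None, new.lower(), existing.lower()).ratio()
    return ratio >= threshold
```

A close paraphrase clears the threshold easily, while unrelated learnings fall well below it.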
sixthsense extracts five categories from Claude Code transcripts:
| Category | Patterns | Example |
|---|---|---|
| learning | "I learned that...", "Turns out...", insight blocks | "transcript uses type not role for message type" |
| gotcha | "Watch out...", "This broke because...", "Doesn't work when..." | "WebFetch fails on authenticated sites" |
| decision | "Decided to...", "Going with...", "Using X instead of Y" | "going with SQLite over PostgreSQL for portability" |
| tool_error | Tool failures, errors, timeouts | "WebFetch timeout on large pages" |
| pivot | "Let me try...", "That didn't work..." | "switching to gh CLI instead of GitHub API" |
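A simplified sketch of the phrase matching follows. The real patterns in `extract-learnings.py` are more extensive, and `tool_error` is detected from tool results rather than phrases, so this is an illustration only:

```python
import re

# Illustrative phrase patterns per category (not the actual pattern set)
CATEGORY_PATTERNS = {
    "learning": r"\b(I learned that|Turns out)\b",
    "gotcha": r"\b(Watch out|This broke because|Doesn't work when)\b",
    "decision": r"\b(Decided to|Going with)\b",
    "pivot": r"\b(Let me try|That didn't work)\b",
}

def categorize(text: str):
    """Return the first category whose phrase pattern matches, else None."""
    for category, pattern in CATEGORY_PATTERNS.items():
        if re.search(pattern, text, re.IGNORECASE):
            return category
    return None
```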
Not everything gets stored. Before insertion:
- Min length: Content > 30 chars (filters fragments)
- Max length: Content < 500 chars (filters dumps)
- Not a question: Skips content ending with "?"
- Not pure code: Skips if > 60% backtick content
- Fuzzy dedup: If 70%+ similar to an existing learning → emits a `reinforced` signal instead
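These gates can be sketched as a single predicate — an illustrative approximation of the real checks, including a crude backtick-span heuristic for "pure code":

```python
def code_fraction(content: str) -> float:
    """Fraction of characters sitting inside `backtick` spans."""
    parts = content.split("`")
    inside = sum(len(p) for p in parts[1::2])  # odd segments are inside spans
    return inside / len(content) if content else 0.0

def passes_quality_gates(content: str) -> bool:
    """Pre-insertion filters from the list above (illustrative sketch)."""
    if not (30 < len(content) < 500):
        return False                      # too short / too long
    if content.rstrip().endswith("?"):
        return False                      # questions aren't learnings
    if code_fraction(content) > 0.6:
        return False                      # mostly code
    return True
```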
```bash
git clone https://github.com/shihwesley/sixthsense.git
cd sixthsense
./install.sh /path/to/your/project
```

If you use agent-reverse:

```
/agent-reverse analyze https://github.com/shihwesley/sixthsense
```

```
your-project/.claude/
├── hooks/
│   └── extract-learnings.py        # SessionEnd hook
├── scripts/
│   ├── score-learnings.py          # Quality scoring engine
│   ├── synthesize-learnings.py     # MEMORY.md generator
│   ├── backport-learnings.py       # Skill file auto-updater
│   └── self-learn.sh               # Pipeline orchestrator
├── commands/
│   └── self-learn.md               # /self-learn slash command
├── cache/
│   └── learnings.db                # SQLite database (auto-created)
└── settings.json                   # Hook configuration (merged)
```
- Copy `src/extract-learnings.py` to `.claude/hooks/`
- Copy remaining scripts from `src/` to `.claude/scripts/`
- Copy `commands/self-learn.md` to `.claude/commands/`
- Add the SessionEnd hook to `.claude/settings.json`:
```json
{
  "hooks": {
    "SessionEnd": [
      {
        "type": "command",
        "command": "python3 .claude/hooks/extract-learnings.py"
      }
    ]
  }
}
```

- Python 3.8+ (standard library only — no pip dependencies)
- Claude Code CLI (for synthesis step)
- SQLite (bundled with Python)
After installation, sixthsense runs silently:
- Every session end: Extracts learnings from the transcript
- Periodically: Scores → synthesizes → backports (via LaunchAgent or cron)
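If you schedule with cron rather than a LaunchAgent, an entry along these lines matches the 12-hour cadence (the project path is a placeholder):

```shell
# Illustrative crontab entry (edit with `crontab -e`):
# run the full pipeline every 12 hours.
0 */12 * * * cd /path/to/your/project && .claude/scripts/self-learn.sh
```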
The `/self-learn` command is installed to `.claude/commands/self-learn.md`:
| Command | What it does |
|---|---|
| `/self-learn` | Force a full pipeline run (score → synthesize → backport) |
| `/self-learn dry-run` | Preview what synthesis would produce |
| `/self-learn top` | Show top 10 learnings by quality score |
| `/self-learn backport` | Preview what would be backported to skill files |
| `/self-learn stats` | View learning statistics and signal distribution |
| `/self-learn migrate` | One-time migration for existing learnings |
```bash
# Force a synthesis run now
.claude/scripts/self-learn.sh --force

# Preview what synthesis would do
.claude/scripts/self-learn.sh --dry-run

# See your top learnings by score
python3 .claude/scripts/score-learnings.py --top 10

# See learnings as JSON
python3 .claude/scripts/score-learnings.py --top 20 --json

# Preview what would be backported to skill files
python3 .claude/scripts/backport-learnings.py --dry-run

# Migrate existing learnings (run once after install)
python3 .claude/hooks/extract-learnings.py --migrate
```

```bash
# Count learnings by category
sqlite3 .claude/cache/learnings.db \
  "SELECT category, COUNT(*) FROM learnings GROUP BY category"

# See signal distribution
sqlite3 .claude/cache/learnings.db \
  "SELECT signal_type, COUNT(*) FROM learning_signals GROUP BY signal_type"

# Check backport audit trail
sqlite3 .claude/cache/learnings.db \
  "SELECT target_file, content_added, score_at_backport FROM backport_log"
```

| Variable | Default | Description |
|---|---|---|
| `SIXTHSENSE_DIR` | Auto-detected `.claude/` | Override the Claude directory |
| `SIXTHSENSE_MEMORY_FILE` | `MEMORY.md` in project memory dir | Override MEMORY.md location |
| `SIXTHSENSE_BUDGET` | `0.10` | Max USD budget for Haiku synthesis (validated as numeric) |
| `SIXTHSENSE_DEBUG` | (unset) | Set to any value to enable debug logging to stderr |
Edit the scripts directly to adjust:
- Backport threshold: `DEFAULT_THRESHOLD = 6.0` in `backport-learnings.py` (skill files)
- Generic threshold: `GENERIC_THRESHOLD = 8.0` (CLAUDE.md backports)
- Synthesis threshold: `MIN_NEW_LEARNINGS = 5` in `synthesize-learnings.py`
- Dedup threshold: `DEDUP_THRESHOLD = 0.7` in `extract-learnings.py`
sixthsense is built on a fundamental belief: self-learning should be invisible. The agent should not ask the user "should I remember this?" or "was this useful?" — it should observe, score, and promote knowledge on its own.
This is why sixthsense uses hooks + background pipelines instead of MCP servers or interactive tools:
- Hooks are fire-and-forget. The SessionEnd hook runs after the conversation is over. No agent cooperation needed. No user prompt. No "would you like me to save this learning?" interruptions. The extraction just happens.
- Scoring is derived, not declared. The signal event stream accumulates evidence passively. A learning doesn't need a human to say "this is important" — if it keeps appearing across sessions, the score rises automatically.
- Backport is threshold-gated, not approval-gated. When a learning hits score ≥ 6, it moves to a skill file. No PR review. No "approve these changes?" dialog. The conversation history was the review — every time the user corrected or reinforced a behavior, that was signal.
Why not an MCP server? MCP servers are request-response tools — Claude calls them during a conversation. sixthsense's extraction happens after conversations, synthesis runs periodically, and backport runs after synthesis. Making this an MCP server would force the agent to remember to call `extract_learnings()` at the end of every session, defeating the "ambient" part entirely.
Why not ask the user? Because the user already told us. When they say "no, use X instead" — that's a correction signal. When they invoke the same skill three sessions in a row — that's reinforcement. When the same gotcha appears in two different projects — that's a pattern. The data is already there in the transcripts. Asking "should I save this?" adds friction to a process that should be frictionless.
We are at a stage where agentic systems can handle this kind of autonomous curation. The pattern recognition, fuzzy matching, and signal scoring that sixthsense does today is straightforward enough for automated pipelines to manage reliably. As models improve, these capabilities will only get more accurate.
We believe that the next iterations of Claude Code — and the Claude model family itself — will have self-learning, cross-session memory, and context layering built in natively. The capabilities sixthsense implements today (extraction, scoring, synthesis, backport) are features that belong in the platform, not in userland scripts.
sixthsense exists to prove the pattern works now, with today's tools. It's a bridge: a working implementation of what ambient agent learning looks like, built with hooks and SQLite and shell scripts, waiting to be replaced by something better that ships with the platform.
When that day comes, sixthsense will have served its purpose — and the learnings it accumulated along the way will have already been backported into your skill files, ready for whatever comes next.
sixthsense is designed to be near-zero cost in normal operation. The only step that calls an LLM is synthesis — everything else runs locally.
| Pipeline Step | LLM Call? | Cost |
|---|---|---|
| Extract learnings | No | Free — regex + SQLite |
| Score learnings | No | Free — arithmetic on signals |
| Synthesize MEMORY.md | Yes — Haiku | ~$0.01-0.05 per run |
| Backport to skills | No | Free — file append |
Cost controls built in:
- Cheapest model: Synthesis uses `claude --model haiku` — the fastest, cheapest option. No need for Opus/Sonnet to summarize 20 bullet points.
- Hard budget cap: `--max-budget-usd 0.10` (configurable via `SIXTHSENSE_BUDGET`). The CLI enforces this — synthesis stops if the budget is exceeded.
- Threshold gating: Synthesis only triggers when 5+ new learnings have accumulated since the last run. Below that, the pipeline exits immediately — no API call made.
- Bounded input: Only the top 20 learnings (by quality score) are sent to the LLM. The prompt is compact — typically under 1,000 tokens.
- Output truncation: MEMORY.md is capped at 200 lines to stay within Claude Code's context budget when loaded on session start.
- Rate limit tracking: The extraction hook detects `429` / rate limit errors in transcripts and logs them to `tool_failures` for observability.
In practice, if you run synthesis once per day, sixthsense costs less than $1/month in API usage. The 12-hour LaunchAgent interval means at most 2 synthesis calls per day.
The core data model is an append-only event stream (adapted from Dash's self-learning loop). Instead of mutating a "score" field directly, every interaction that affects a learning's importance is recorded as a signal event. Scores are derived from signals — never stored as ground truth.
This means you can:
- Change signal weights retroactively (just re-run scoring)
- Add new signal types without migration
- Audit exactly why a learning has a given score
- Replay the full history of a learning's lifecycle
When a learning graduates to a skill file, it's wrapped in HTML comment markers:

```markdown
## Auto-Learned Notes
<!-- sixthsense:section — auto-managed, do not edit above this line -->
<!-- sixthsense:backport:start id=learning_42 score=8.5 date=2026-02-05 -->
> **Learned:** WebFetch fails on authenticated sites — ask user to paste content, don't retry.
<!-- sixthsense:backport:end -->
```

Following memvid's single-file philosophy, everything lives in one SQLite database:
```sql
-- Core: what was learned
learnings (id, project_path, category, content, source, primary_skill, tags,
           quality_score, backported, created_at)

-- Events: what happened to each learning (signal stream inspired by Dash)
learning_signals (id, learning_id, signal_type, session_id, context, created_at)

-- Audit: what was promoted
backport_log (id, learning_id, target_file, content_added, score_at_backport, created_at)

-- Observability: session-level stats
session_stats (id, session_id, project_path, total_messages, assistant_messages,
               tool_calls, tool_errors, created_at)

-- Diagnostics: tool failure patterns
tool_failures (id, session_id, tool_name, error_type, error_message, project_path, created_at)
```

sixthsense is a derivative work that combines ideas from three open-source projects. We are grateful to their authors and want to be transparent about what was adapted:
| Project | Author(s) | License | What sixthsense adapted |
|---|---|---|---|
| Dash | Ashpreet Bedi / agno-agi | MIT | Six-layer context architecture, signal-weighted scoring model, self-learning loop concept |
| memvid | Memvid team | MIT | Single-file memory philosophy, compressed retrieval without vectors, fuzzy matching over embeddings |
| claude-brain | Memvid team | MIT | Claude Code memory integration patterns, .mv2-style persistent cross-session memory, MEMORY.md as the retrieval surface |
The six-layer context architecture was originally developed by OpenAI for their internal data agent: Inside our in-house data agent. Dash (agno-agi) created the first open-source implementation of this architecture.
- agent-reverse — Capability extraction MCP server used throughout sixthsense development to analyze repos, extract skill patterns, and manage the build process.
MIT — see LICENSE.