The agent harness that gives you the assistant you want — part Star Trek Computer, part J.A.R.V.I.S.
Works with Claude Code + Antigravity — see compatibility
Think of Agent M as the structural backend harness you wished you had—part Star Trek Computer, part J.A.R.V.I.S.-level contextual autonomy, engineered to manage your projects, memory, and persistent knowledge across any modern agent surface, gaining experience and self-improving as it goes.
Imagine those workflows you saw in the movies. You're talking to your agent, "open a new project file for M" and off you go. Agent M remembers your projects and files together, talks with you about them, and learns and grows with you as you work. The context is self-maintaining — no time spent curating your own knowledge graph, and it can help with your personal notes too when you want it to.
This repo is the harness — the phase-gated workflow, auto-recall hooks, sub-agents, and on-disk state that make Agent M a system instead of a folder of files. It pairs with Crickets — a tactical suite of agent primitives (skills, hooks, sub-agents, bundles) that acts as the execution engine the harness installs into your target projects.
Latest: v4.12.0 (2026-06-01) — Cross-surface Agent M vault access (V4 #22; single-repo MINOR). Your AgentMemory vault is now readable natively from every agent surface you use, not just Claude Code. One canonical, paste-anywhere context payload (
templates/agentmemory-context.md) teaches any surface how to navigate the vault; Claude.ai (Google Drive connector) and Claude Desktop (filesystem MCP server) read it read-only; Antigravity gets it as an installed rule — per-project (.agents/rules/) and globally (~/.gemini/GEMINI.mdviainstall.sh --scope user), so it picks up the vault in every workspace with no per-project install. Read/write is surface-scoped — chat surfaces read-only, the filesystem working agents you run (Claude Code, Antigravity) may write. Also migrated the Antigravity adapter.agent/→.agents/(the 2.0 default) and made/doctorhost-aware. All surfaces operator-dogfood-validated (Antigravity on both the CLI and the IDE). Gemini/ChatGPT/Codex deferred (chat-only, no live vault access yet). Configure-don't-build — no new MCP server, API, or daemon. Per ROADMAP-V4 item #22.
Prior: v4.11.0 (2026-05-30) — Opt-in--applyfor personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only;--applynow writes the safe suggestions into a marked## Relatedsection per note — backup-first, idempotent, skipping unsafe-named + already-linked targets. Read-only stays the default;--applyis the operator-directed opt-in (A3). Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests. Per ROADMAP-V4 item #43.
Release notes · Agent M Evolution HLD · Device-Wide Architecture HLD · CHANGELOG
| Piece | What it is |
|---|---|
| Agent M | The system as a whole — this repo + Crickets + your AgentMemory vault folder, working together |
| Harness (this repo) | Phase-gated workflow (/setup /plan /work /review /release /bugfix) + auto-recall + sub-agents + scripts |
Crickets (crickets) |
Skills, hooks, sub-agents, bundles — the primitives you install into your projects |
| AgentMemory vault | Your Obsidian markdown folder (synced via Google Drive / Dropbox / etc.) — agent reads at session start, writes under controlled conditions |
Agent M is opinionated — small, not a 150-agent supermarket. It works with YOLO mode and other fully automated coding workflows, but it's designed for the ones that keep a human in the loop.
| Vanilla Claude Code | Claude Code + Agent M | |
|---|---|---|
| Session continuity | Memory ends with the session; the next prompt starts blank | Vault-backed; new sessions auto-recall the entries relevant to where you left off |
| Per-phase auto-context | You re-explain conventions every time, or rely on a static CLAUDE.md |
Each phase (/setup /plan /work /review /release) recalls phase-scoped entries within a token budget |
| Evidence-tracked task closeouts | Tasks close when the agent says they're done | evidence-tracker hook blocks [ ] → [x] flips in PLAN.md unless the agent actually read the spec/test files first |
| Paired-release coordination | Manual cross-repo coordination per release | Locked release-order convention + URL-linked sibling release notes + paired CI verification on both repos |
| Cross-project memory | Each project's CLAUDE.md lives in isolation |
Vault holds operator-wide conventions + per-project sub-trees; the same locked decisions surface across every project you work in |
Agent M doesn't replace Claude Code — it gives it persistence, structure, and the kind of accumulating context that turns a fresh session into a continuation.
Once both repos are cloned and the vault folder exists, Agent M is operational.
1. Install both repos as siblings
git clone https://github.com/alexherrero/agentm.git ~/Antigravity/agentm
git clone https://github.com/alexherrero/crickets.git ~/Antigravity/crickets2. Point the vault at your existing Obsidian + sync setup
mkdir -p "<sync-root>/AgentMemory/personal-private/_always-load"
mkdir -p "<sync-root>/AgentMemory/projects"
mkdir -p "<sync-root>/AgentMemory/_meta"
export MEMORY_VAULT_PATH="<sync-root>/AgentMemory"Note
Pre-v4.1.0 vaults used personal-projects/ (renamed to projects/ in V4 #26). Existing operators should run bash agentm/scripts/rename-vault-personal-projects.sh after upgrading. The resolver chain transparently handles both layouts during transition — recall + save keep working either way.
Any sync layer works (Google Drive, Dropbox, syncthing).
3. Install the harness + Crickets bundle into your target project
# Harness (this repo) — slash commands, sub-agents, .harness/ state, AGENTS.md / CLAUDE.md, wiki/ scaffold
bash ~/Antigravity/agentm/install.sh [--hooks] /path/to/your-project
# Crickets bundle — evaluator sub-agent + 4 base hooks (kill-switch, steer, commit-on-stop, evidence-tracker) in one operation
bash ~/Antigravity/crickets/install.sh /path/to/your-project --bundle quality-gates
# Memory skill — /memory save / evolve / reflect / search / etc.
bash ~/Antigravity/crickets/install.sh /path/to/your-project --skill memoryInstallations are idempotent; --hooks is opt-in for verification hooks. Windows: use install.ps1 with PowerShell 7+; same flag shape with -Hooks and -Update.
More install detail — seed your always-load entries + verify
4. Seed your always-load entries
Capture your locked conventions, coding-style rules, project invariants under <vault>/personal-private/_always-load/. One entry per concern. The first pass is co-created — you and the agent walk through it together; you approve each entry.
5. Verify
python3 ~/Antigravity/agentm/scripts/harness_memory.py recall --phase setupShould print your always-load entries within the 4000-token budget.
Full install detail: wiki/how-to/Install-Into-Project.md.
flowchart LR
U([You])
H[Host<br/>Claude Code · Antigravity]
A[Adapter<br/>commands · agents · skills]
S[Canonical specs<br/>harness/]
ST[(.harness/<br/>state)]
W[(wiki/<br/>→ GitHub Wiki)]
V[(AgentMemory<br/>vault — synced)]
U -->|/slash command| H
H --> A
A --> S
S --> ST
S --> W
S --> V
V --> A
| Command | Purpose |
|---|---|
/setup |
First-time project init — scaffold, init.sh, feature list, vault recall |
/plan |
Turn a brief into .harness/PLAN.md — tasks with pass/fail criteria |
/work |
Execute one task from the plan; evidence-tracked; update progress; stop |
/review |
Adversarial critique of the change — must produce executable artifact |
/release |
Pre-merge gate — clean tree, verification passes, changelog, paired-release coordination |
/bugfix |
Report → Analyze → Fix → Verify pipeline with GitHub Issue as posterity record |
Every phase auto-recalls relevant entries from your AgentMemory vault at start, and offers to save new durable knowledge at exit. Self-modulating offer-save (confidence-thresholded) and cursor-tracked promotion keep the vault current without nagging you.
Legacy single-file canonical skills (delivered via the per-host adapters/ pipeline):
| Skill | What it does |
|---|---|
migrate-to-diataxis |
One-shot migration of an already-installed project's wiki/ to the Diátaxis four-mode layout. Preview-first, git mv for blame, non-destructive. (Superseded by diataxis-author for new work; kept for legacy migration.) |
doctor |
User-invoked (/doctor). Verifies the install is correctly wired up in this host — structural by default, --live adds real sub-agent dispatches and skill dry-runs. |
Compound skills imported from Crickets in v4.0.0 (V4 #36) — delivered via the manifest-walking dispatcher in install.sh / install.ps1:
| Skill | What it does |
|---|---|
memory |
The Agent M memory skill itself. /memory save / evolve / reflect / search / index-skills / discover-skills / adapt-skills / watchlist / promote. Permeable A3 write boundary; collision-checked; supersession-not-deletion. Powers the recall + reflect hook loop. |
design |
Human-facing design pipeline → agent execution handoff. /design author walks a 10-section template; /design translate splits the approved design into structural parts; /design sequence generates a PLAN.md per part for /work + /release flow. |
diataxis-author |
Author + maintain a Diátaxis-style wiki for any repo. /diataxis author / check / repair / migrate / classify. Subsumes the harness's migrate-to-diataxis predecessor. |
ship-release |
Cut a tagged GitHub release with semver-driven version bumps from conventional commits. Writes CHANGELOG, tags, pushes, creates the release. |
Hooks (claude-code only per ADR 0009):
| Hook | What it does |
|---|---|
memory-recall-session-start |
SessionStart event → loads always-load entries from your vault into the agent's context (deduped, status-filtered, ~500ms budget). |
memory-recall-prompt-submit |
UserPromptSubmit event → keyword + vector-search recall of relevant entries based on the current prompt (~300ms budget; never blocks the prompt). |
memory-reflect-stop |
Stop event → mines the session transcript for durable-knowledge candidates (preferences, workflows, fixes, ideas); HIGH-confidence auto-saves to canonical paths, MEDIUM/LOW + ideas land in _inbox/. |
memory-reflect-idle |
SessionStart event → recovers orphan reflection markers from crashed sessions, processes deferred reflection candidates. |
evidence-tracker |
Default-FAIL evidence enforcement on /work task closeouts. Blocks [ ] → [x] flips in PLAN.md unless the agent demonstrably Read the spec/test files first. Hybrid resolver (heuristic + per-task override + explicit opt-out). |
Sub-agents (imported in v4.0.0; existing 4 legacy agents at harness/agents/):
| Sub-agent | What it does |
|---|---|
memory-idea-researcher |
Read-only deep-research worker for _idea-incubator/ skeletons. Bounded wall-time / web-fetch / token budgets enforced from the skeleton's frontmatter. |
Plugins (Antigravity 2.0 / agy v1.0.2+):
| Plugin | What it does |
|---|---|
example-plugin |
Reference plugin showing the Antigravity 2.0 plugin manifest format. Install via bash scripts/install-plugin.sh example-plugin. |
Base primitives + the 2 evaluator sub-agents (evaluator, diataxis-evaluator) + 3 operator-control hooks (kill-switch, steer, commit-on-stop) + 2 utility skills (pii-scrubber, dependabot-fixer) live in Crickets. (The memory-flow adapt-evaluator sub-agent moved to agentm in V4 #23 — memory primitives are agentm-native.) See ADR 0012 for the device-wide-by-default split rationale and ADR 0006 for the original split decision.
.harness/progress.md accumulates evidence of whether the harness is working. Run .harness/scripts/telemetry.sh for a per-project report or --all for multi-project. Signal definitions in harness/telemetry.md.
Top-level layout
agentm/
├── harness/ # canonical phase specs + harness-shipped skills (doctor, migrate-to-diataxis) + telemetry doc + principles
├── adapters/ # per-host wiring (claude-code/, antigravity/) — thin shims that point back at the canonical specs in harness/
├── wiki/ # Diátaxis-shaped docs (tutorials/ + how-to/ + reference/ + explanation/) — published as the GitHub Wiki
├── scripts/ # install helpers + smoke tests + harness_memory.py + manifest validators
├── templates/ # scaffolding (PLAN.md template, init.sh template) installed into target projects
├── assets/ # Agent M brand assets — logo, monogram, brand preview
├── lib/ # shared install plumbing — byte-identical to Crickets' lib/install/
├── AGENTS.md # universal instructions for any AGENTS.md-aware host
├── CLAUDE.md # Claude Code entry point — points back at AGENTS.md
├── install.sh # POSIX installer (Linux + Mac)
└── install.ps1 # Windows installer (PowerShell 7+)
Agent M has grown over time across paired releases of agentm and crickets. The full V1→V4 evolution — what shipped, what's deferred, where the design is going — lives in Agent Memory Evolution on the Crickets side. V3 Retrospective covers what shipped, what we learned, what's deferred.
For the harness's design rationale, see harness/principles.md and the architecture decisions under wiki/explanation/decisions/.
Currently shipping v4.11.0 — Opt-in --apply for personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only; --apply now writes the safe suggestions into a marked ## Related section per note — backup-first (<vault>/_meta/notes-backup-<date>.tar.gz, refuses to apply if backup fails, prints the tar xzf revert), idempotent (re-running merges into the one section, never duplicates), skipping unsafe-named + already-linked targets in both directions. Read-only stays the default; --apply is the operator-directed opt-in (A3). A feedback-loop fix excludes the agent ## Related section from the TF-IDF/embedding scoring body so injected link text can't inflate similarity. An adversarial review caught + fixed destructive data loss — the agent block is now anchored to its %% … %% comment line (never a bare substring), so a prose mention of the tool or a human-authored ## Related is never clobbered. Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests (426 → 435). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.
Prior: v4.10.0 — Personal-notes link-discovery audit (V4 #43; single-repo MINOR). The read-only complement to v4.9.0's vault lint. Where vault_lint.py lints the agent-shaped AgentMemory/ entries and skips the operator's free-form notes, harness/skills/memory/scripts/notes_link_discovery.py audits those personal notes for missing connections between them — "these two notes look related but aren't [[linked]]." Two relatedness signals over the personal-notes corpus (the Obsidian root minus AgentMemory//.obsidian//.trash/.git): TF-IDF lexical overlap (always on) + embedding semantic similarity (opt-in --embeddings, local BGE via embed.py, cached at <vault>/_meta/notes-embeddings.json separate from the AgentMemory vec-index, graceful-skip to TF-IDF-only). A --report writes both signals side-by-side to <vault>/_meta/notes-links-<date>.md with paste-ready [[wikilinks]]; it never touches a personal note (A3, DC-1) and is strictly personal↔personal (DC-2). Live dogfood (392 notes) drove clip-HTML + bilingual-Spanish cleaning; the standout embedding-only catch — the same talk in [SP] and [EN] (cross-language, cosine 0.988) — is one TF-IDF structurally can't make. Three adversarial reviews caught three real fixes (small-corpus band collapse, a destructive --out hole, stale-dimension cache reuse), each regression-tested. +36 tests (406 → 426 per OS workflow). Deferred follow-ups: watcher/real-time re-embed (re-run is fine at this size); packaging into the crickets personal-notes bundle (V4 ④). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.
Self-tested on every push by three per-OS workflows (Linux, Mac, Windows) running in parallel. Run the same gates locally with bash scripts/smoke-install-bash.sh. Details and the full invariant list in CONTRIBUTING.md.
