Skip to content

alexherrero/agentm

Repository files navigation

Agent M — The structural backend harness you wished you had

The agent harness that gives you the assistant you want — part Star Trek Computer, part J.A.R.V.I.S.

CI Latest release License: MIT

Works with Claude Code + Antigravity — see compatibility

Think of Agent M as the structural backend harness you wished you had—part Star Trek Computer, part J.A.R.V.I.S.-level contextual autonomy, engineered to manage your projects, memory, and persistent knowledge across any modern agent surface, gaining experience and self-improving as it goes.

Imagine those workflows you saw in the movies. You're talking to your agent, "open a new project file for M" and off you go. Agent M remembers your projects and files together, talks with you about them, and learns and grows with you as you work. The context is self-maintaining — no time spent curating your own knowledge graph, and it can help with your personal notes too when you want it to.

This repo is the harness — the phase-gated workflow, auto-recall hooks, sub-agents, and on-disk state that make Agent M a system instead of a folder of files. It pairs with Crickets — a tactical suite of agent primitives (skills, hooks, sub-agents, bundles) that acts as the execution engine the harness installs into your target projects.

Latest: v4.12.0 (2026-06-01) — Cross-surface Agent M vault access (V4 #22; single-repo MINOR). Your AgentMemory vault is now readable natively from every agent surface you use, not just Claude Code. One canonical, paste-anywhere context payload (templates/agentmemory-context.md) teaches any surface how to navigate the vault; Claude.ai (Google Drive connector) and Claude Desktop (filesystem MCP server) read it read-only; Antigravity gets it as an installed rule — per-project (.agents/rules/) and globally (~/.gemini/GEMINI.md via install.sh --scope user), so it picks up the vault in every workspace with no per-project install. Read/write is surface-scoped — chat surfaces read-only, the filesystem working agents you run (Claude Code, Antigravity) may write. Also migrated the Antigravity adapter .agent/.agents/ (the 2.0 default) and made /doctor host-aware. All surfaces operator-dogfood-validated (Antigravity on both the CLI and the IDE). Gemini/ChatGPT/Codex deferred (chat-only, no live vault access yet). Configure-don't-build — no new MCP server, API, or daemon. Per ROADMAP-V4 item #22.
Prior: v4.11.0 (2026-05-30) — Opt-in --apply for personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only; --apply now writes the safe suggestions into a marked ## Related section per note — backup-first, idempotent, skipping unsafe-named + already-linked targets. Read-only stays the default; --apply is the operator-directed opt-in (A3). Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests. Per ROADMAP-V4 item #43.
Release notes · Agent M Evolution HLD · Device-Wide Architecture HLD · CHANGELOG

What's where

Piece What it is
Agent M The system as a whole — this repo + Crickets + your AgentMemory vault folder, working together
Harness (this repo) Phase-gated workflow (/setup /plan /work /review /release /bugfix) + auto-recall + sub-agents + scripts
Crickets (crickets) Skills, hooks, sub-agents, bundles — the primitives you install into your projects
AgentMemory vault Your Obsidian markdown folder (synced via Google Drive / Dropbox / etc.) — agent reads at session start, writes under controlled conditions

Agent M is opinionated — small, not a 150-agent supermarket. It works with YOLO mode and other fully automated coding workflows, but it's designed for the ones that keep a human in the loop.

Why Agent M?

Vanilla Claude Code Claude Code + Agent M
Session continuity Memory ends with the session; the next prompt starts blank Vault-backed; new sessions auto-recall the entries relevant to where you left off
Per-phase auto-context You re-explain conventions every time, or rely on a static CLAUDE.md Each phase (/setup /plan /work /review /release) recalls phase-scoped entries within a token budget
Evidence-tracked task closeouts Tasks close when the agent says they're done evidence-tracker hook blocks [ ] → [x] flips in PLAN.md unless the agent actually read the spec/test files first
Paired-release coordination Manual cross-repo coordination per release Locked release-order convention + URL-linked sibling release notes + paired CI verification on both repos
Cross-project memory Each project's CLAUDE.md lives in isolation Vault holds operator-wide conventions + per-project sub-trees; the same locked decisions surface across every project you work in

Agent M doesn't replace Claude Code — it gives it persistence, structure, and the kind of accumulating context that turns a fresh session into a continuation.

Get started

Once both repos are cloned and the vault folder exists, Agent M is operational.

1. Install both repos as siblings

git clone https://github.com/alexherrero/agentm.git ~/Antigravity/agentm
git clone https://github.com/alexherrero/crickets.git    ~/Antigravity/crickets

2. Point the vault at your existing Obsidian + sync setup

mkdir -p "<sync-root>/AgentMemory/personal-private/_always-load"
mkdir -p "<sync-root>/AgentMemory/projects"
mkdir -p "<sync-root>/AgentMemory/_meta"
export MEMORY_VAULT_PATH="<sync-root>/AgentMemory"

Note

Pre-v4.1.0 vaults used personal-projects/ (renamed to projects/ in V4 #26). Existing operators should run bash agentm/scripts/rename-vault-personal-projects.sh after upgrading. The resolver chain transparently handles both layouts during transition — recall + save keep working either way.

Any sync layer works (Google Drive, Dropbox, syncthing).

3. Install the harness + Crickets bundle into your target project

# Harness (this repo) — slash commands, sub-agents, .harness/ state, AGENTS.md / CLAUDE.md, wiki/ scaffold
bash ~/Antigravity/agentm/install.sh [--hooks] /path/to/your-project

# Crickets bundle — evaluator sub-agent + 4 base hooks (kill-switch, steer, commit-on-stop, evidence-tracker) in one operation
bash ~/Antigravity/crickets/install.sh /path/to/your-project --bundle quality-gates

# Memory skill — /memory save / evolve / reflect / search / etc.
bash ~/Antigravity/crickets/install.sh /path/to/your-project --skill memory

Installations are idempotent; --hooks is opt-in for verification hooks. Windows: use install.ps1 with PowerShell 7+; same flag shape with -Hooks and -Update.

More install detail — seed your always-load entries + verify

4. Seed your always-load entries

Capture your locked conventions, coding-style rules, project invariants under <vault>/personal-private/_always-load/. One entry per concern. The first pass is co-created — you and the agent walk through it together; you approve each entry.

5. Verify

python3 ~/Antigravity/agentm/scripts/harness_memory.py recall --phase setup

Should print your always-load entries within the 4000-token budget.

Full install detail: wiki/how-to/Install-Into-Project.md.

How it works

flowchart LR
    U([You])
    H[Host<br/>Claude Code · Antigravity]
    A[Adapter<br/>commands · agents · skills]
    S[Canonical specs<br/>harness/]
    ST[(.harness/<br/>state)]
    W[(wiki/<br/>→ GitHub Wiki)]
    V[(AgentMemory<br/>vault — synced)]

    U -->|/slash command| H
    H --> A
    A --> S
    S --> ST
    S --> W
    S --> V
    V --> A
Loading

Phases

Command Purpose
/setup First-time project init — scaffold, init.sh, feature list, vault recall
/plan Turn a brief into .harness/PLAN.md — tasks with pass/fail criteria
/work Execute one task from the plan; evidence-tracked; update progress; stop
/review Adversarial critique of the change — must produce executable artifact
/release Pre-merge gate — clean tree, verification passes, changelog, paired-release coordination
/bugfix Report → Analyze → Fix → Verify pipeline with GitHub Issue as posterity record

Every phase auto-recalls relevant entries from your AgentMemory vault at start, and offers to save new durable knowledge at exit. Self-modulating offer-save (confidence-thresholded) and cursor-tracked promotion keep the vault current without nagging you.

Skills shipped with the harness

Legacy single-file canonical skills (delivered via the per-host adapters/ pipeline):

Skill What it does
migrate-to-diataxis One-shot migration of an already-installed project's wiki/ to the Diátaxis four-mode layout. Preview-first, git mv for blame, non-destructive. (Superseded by diataxis-author for new work; kept for legacy migration.)
doctor User-invoked (/doctor). Verifies the install is correctly wired up in this host — structural by default, --live adds real sub-agent dispatches and skill dry-runs.

Compound skills imported from Crickets in v4.0.0 (V4 #36) — delivered via the manifest-walking dispatcher in install.sh / install.ps1:

Skill What it does
memory The Agent M memory skill itself. /memory save / evolve / reflect / search / index-skills / discover-skills / adapt-skills / watchlist / promote. Permeable A3 write boundary; collision-checked; supersession-not-deletion. Powers the recall + reflect hook loop.
design Human-facing design pipeline → agent execution handoff. /design author walks a 10-section template; /design translate splits the approved design into structural parts; /design sequence generates a PLAN.md per part for /work + /release flow.
diataxis-author Author + maintain a Diátaxis-style wiki for any repo. /diataxis author / check / repair / migrate / classify. Subsumes the harness's migrate-to-diataxis predecessor.
ship-release Cut a tagged GitHub release with semver-driven version bumps from conventional commits. Writes CHANGELOG, tags, pushes, creates the release.

Hooks (claude-code only per ADR 0009):

Hook What it does
memory-recall-session-start SessionStart event → loads always-load entries from your vault into the agent's context (deduped, status-filtered, ~500ms budget).
memory-recall-prompt-submit UserPromptSubmit event → keyword + vector-search recall of relevant entries based on the current prompt (~300ms budget; never blocks the prompt).
memory-reflect-stop Stop event → mines the session transcript for durable-knowledge candidates (preferences, workflows, fixes, ideas); HIGH-confidence auto-saves to canonical paths, MEDIUM/LOW + ideas land in _inbox/.
memory-reflect-idle SessionStart event → recovers orphan reflection markers from crashed sessions, processes deferred reflection candidates.
evidence-tracker Default-FAIL evidence enforcement on /work task closeouts. Blocks [ ][x] flips in PLAN.md unless the agent demonstrably Read the spec/test files first. Hybrid resolver (heuristic + per-task override + explicit opt-out).

Sub-agents (imported in v4.0.0; existing 4 legacy agents at harness/agents/):

Sub-agent What it does
memory-idea-researcher Read-only deep-research worker for _idea-incubator/ skeletons. Bounded wall-time / web-fetch / token budgets enforced from the skeleton's frontmatter.

Plugins (Antigravity 2.0 / agy v1.0.2+):

Plugin What it does
example-plugin Reference plugin showing the Antigravity 2.0 plugin manifest format. Install via bash scripts/install-plugin.sh example-plugin.

Base primitives + the 2 evaluator sub-agents (evaluator, diataxis-evaluator) + 3 operator-control hooks (kill-switch, steer, commit-on-stop) + 2 utility skills (pii-scrubber, dependabot-fixer) live in Crickets. (The memory-flow adapt-evaluator sub-agent moved to agentm in V4 #23 — memory primitives are agentm-native.) See ADR 0012 for the device-wide-by-default split rationale and ADR 0006 for the original split decision.

Telemetry

.harness/progress.md accumulates evidence of whether the harness is working. Run .harness/scripts/telemetry.sh for a per-project report or --all for multi-project. Signal definitions in harness/telemetry.md.

Repo structure

Top-level layout
agentm/
├── harness/          # canonical phase specs + harness-shipped skills (doctor, migrate-to-diataxis) + telemetry doc + principles
├── adapters/         # per-host wiring (claude-code/, antigravity/) — thin shims that point back at the canonical specs in harness/
├── wiki/             # Diátaxis-shaped docs (tutorials/ + how-to/ + reference/ + explanation/) — published as the GitHub Wiki
├── scripts/          # install helpers + smoke tests + harness_memory.py + manifest validators
├── templates/        # scaffolding (PLAN.md template, init.sh template) installed into target projects
├── assets/           # Agent M brand assets — logo, monogram, brand preview
├── lib/              # shared install plumbing — byte-identical to Crickets' lib/install/
├── AGENTS.md         # universal instructions for any AGENTS.md-aware host
├── CLAUDE.md         # Claude Code entry point — points back at AGENTS.md
├── install.sh        # POSIX installer (Linux + Mac)
└── install.ps1       # Windows installer (PowerShell 7+)

Architecture history

Agent M has grown over time across paired releases of agentm and crickets. The full V1→V4 evolution — what shipped, what's deferred, where the design is going — lives in Agent Memory Evolution on the Crickets side. V3 Retrospective covers what shipped, what we learned, what's deferred.

For the harness's design rationale, see harness/principles.md and the architecture decisions under wiki/explanation/decisions/.

Status

Currently shipping v4.11.0 — Opt-in --apply for personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only; --apply now writes the safe suggestions into a marked ## Related section per note — backup-first (<vault>/_meta/notes-backup-<date>.tar.gz, refuses to apply if backup fails, prints the tar xzf revert), idempotent (re-running merges into the one section, never duplicates), skipping unsafe-named + already-linked targets in both directions. Read-only stays the default; --apply is the operator-directed opt-in (A3). A feedback-loop fix excludes the agent ## Related section from the TF-IDF/embedding scoring body so injected link text can't inflate similarity. An adversarial review caught + fixed destructive data loss — the agent block is now anchored to its %% … %% comment line (never a bare substring), so a prose mention of the tool or a human-authored ## Related is never clobbered. Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests (426 → 435). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.

Prior: v4.10.0 — Personal-notes link-discovery audit (V4 #43; single-repo MINOR). The read-only complement to v4.9.0's vault lint. Where vault_lint.py lints the agent-shaped AgentMemory/ entries and skips the operator's free-form notes, harness/skills/memory/scripts/notes_link_discovery.py audits those personal notes for missing connections between them — "these two notes look related but aren't [[linked]]." Two relatedness signals over the personal-notes corpus (the Obsidian root minus AgentMemory//.obsidian//.trash/.git): TF-IDF lexical overlap (always on) + embedding semantic similarity (opt-in --embeddings, local BGE via embed.py, cached at <vault>/_meta/notes-embeddings.json separate from the AgentMemory vec-index, graceful-skip to TF-IDF-only). A --report writes both signals side-by-side to <vault>/_meta/notes-links-<date>.md with paste-ready [[wikilinks]]; it never touches a personal note (A3, DC-1) and is strictly personal↔personal (DC-2). Live dogfood (392 notes) drove clip-HTML + bilingual-Spanish cleaning; the standout embedding-only catch — the same talk in [SP] and [EN] (cross-language, cosine 0.988) — is one TF-IDF structurally can't make. Three adversarial reviews caught three real fixes (small-corpus band collapse, a destructive --out hole, stale-dimension cache reuse), each regression-tested. +36 tests (406 → 426 per OS workflow). Deferred follow-ups: watcher/real-time re-embed (re-run is fine at this size); packaging into the crickets personal-notes bundle (V4 ④). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.

Contributing

Self-tested on every push by three per-OS workflows (Linux, Mac, Windows) running in parallel. Run the same gates locally with bash scripts/smoke-install-bash.sh. Details and the full invariant list in CONTRIBUTING.md.

About

Think of Agent M as the structural backend harness you wished you had—part Star Trek Computer, part J.A.R.V.I.S.-level contextual autonomy, engineered to manage your projects, memory, persistent knowledge and complex projects seamlessly across any modern agent surface, gaining experience and self-improving as it goes.

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors