GitHub - alexherrero/agentm: Think of Agent M as the structural backend harness you wished you had—part Star Trek Computer, part J.A.R.V.I.S.-level contextual autonomy, engineered to manage your projects, memory, persistent knowledge and complex projects seamlessly across any modern agent surface, gaining experience and self-improving as it goes.

The agent harness that gives you the assistant you want — part Star Trek Computer, part J.A.R.V.I.S.

_{Works with Claude Code + Antigravity — see compatibility}

Think of Agent M as the structural backend harness you wished you had—part Star Trek Computer, part J.A.R.V.I.S.-level contextual autonomy, engineered to manage your projects, memory, and persistent knowledge across any modern agent surface, gaining experience and self-improving as it goes.

Imagine those workflows you saw in the movies. You're talking to your agent, "open a new project file for M" and off you go. Agent M remembers your projects and files together, talks with you about them, and learns and grows with you as you work. The context is self-maintaining — no time spent curating your own knowledge graph, and it can help with your personal notes too when you want it to.

This repo is the harness — the phase-gated workflow, auto-recall hooks, sub-agents, and on-disk state that make Agent M a system instead of a folder of files. It pairs with Crickets — a tactical suite of agent primitives (skills, hooks, sub-agents, bundles) that acts as the execution engine the harness installs into your target projects.

Latest: v4.12.0 (2026-06-01) — Cross-surface Agent M vault access (V4 #22; single-repo MINOR). Your AgentMemory vault is now readable natively from every agent surface you use, not just Claude Code. One canonical, paste-anywhere context payload (templates/agentmemory-context.md) teaches any surface how to navigate the vault; Claude.ai (Google Drive connector) and Claude Desktop (filesystem MCP server) read it read-only; Antigravity gets it as an installed rule — per-project (.agents/rules/) and globally (~/.gemini/GEMINI.md via install.sh --scope user), so it picks up the vault in every workspace with no per-project install. Read/write is surface-scoped — chat surfaces read-only, the filesystem working agents you run (Claude Code, Antigravity) may write. Also migrated the Antigravity adapter .agent/ → .agents/ (the 2.0 default) and made /doctor host-aware. All surfaces operator-dogfood-validated (Antigravity on both the CLI and the IDE). Gemini/ChatGPT/Codex deferred (chat-only, no live vault access yet). Configure-don't-build — no new MCP server, API, or daemon. Per ROADMAP-V4 item #22.
Prior: v4.11.0 (2026-05-30) — Opt-in --apply for personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only; --apply now writes the safe suggestions into a marked ## Related section per note — backup-first, idempotent, skipping unsafe-named + already-linked targets. Read-only stays the default; --apply is the operator-directed opt-in (A3). Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests. Per ROADMAP-V4 item #43.
Release notes · Agent M Evolution HLD · Device-Wide Architecture HLD · CHANGELOG

What's where

Piece	What it is
Agent M	The system as a whole — this repo + Crickets + your AgentMemory vault folder, working together
Harness (this repo)	Phase-gated workflow (`/setup` `/plan` `/work` `/review` `/release` `/bugfix`) + auto-recall + sub-agents + scripts
Crickets (`crickets`)	Skills, hooks, sub-agents, bundles — the primitives you install into your projects
AgentMemory vault	Your Obsidian markdown folder (synced via Google Drive / Dropbox / etc.) — agent reads at session start, writes under controlled conditions

Agent M is opinionated — small, not a 150-agent supermarket. It works with YOLO mode and other fully automated coding workflows, but it's designed for the ones that keep a human in the loop.

Why Agent M?

	Vanilla Claude Code	Claude Code + Agent M
Session continuity	Memory ends with the session; the next prompt starts blank	Vault-backed; new sessions auto-recall the entries relevant to where you left off
Per-phase auto-context	You re-explain conventions every time, or rely on a static `CLAUDE.md`	Each phase (`/setup` `/plan` `/work` `/review` `/release`) recalls phase-scoped entries within a token budget
Evidence-tracked task closeouts	Tasks close when the agent says they're done	`evidence-tracker` hook blocks `[ ] → [x]` flips in `PLAN.md` unless the agent actually read the spec/test files first
Paired-release coordination	Manual cross-repo coordination per release	Locked release-order convention + URL-linked sibling release notes + paired CI verification on both repos
Cross-project memory	Each project's `CLAUDE.md` lives in isolation	Vault holds operator-wide conventions + per-project sub-trees; the same locked decisions surface across every project you work in

Agent M doesn't replace Claude Code — it gives it persistence, structure, and the kind of accumulating context that turns a fresh session into a continuation.

Get started

Once both repos are cloned and the vault folder exists, Agent M is operational.

1. Install both repos as siblings

git clone https://github.com/alexherrero/agentm.git ~/Antigravity/agentm
git clone https://github.com/alexherrero/crickets.git    ~/Antigravity/crickets

2. Point the vault at your existing Obsidian + sync setup

mkdir -p "<sync-root>/AgentMemory/personal-private/_always-load"
mkdir -p "<sync-root>/AgentMemory/projects"
mkdir -p "<sync-root>/AgentMemory/_meta"
export MEMORY_VAULT_PATH="<sync-root>/AgentMemory"

Note

Pre-v4.1.0 vaults used personal-projects/ (renamed to projects/ in V4 #26). Existing operators should run bash agentm/scripts/rename-vault-personal-projects.sh after upgrading. The resolver chain transparently handles both layouts during transition — recall + save keep working either way.

Any sync layer works (Google Drive, Dropbox, syncthing).

3. Install the harness + Crickets bundle into your target project

# Harness (this repo) — slash commands, sub-agents, .harness/ state, AGENTS.md / CLAUDE.md, wiki/ scaffold
bash ~/Antigravity/agentm/install.sh [--hooks] /path/to/your-project

# Crickets bundle — evaluator sub-agent + 4 base hooks (kill-switch, steer, commit-on-stop, evidence-tracker) in one operation
bash ~/Antigravity/crickets/install.sh /path/to/your-project --bundle quality-gates

# Memory skill — /memory save / evolve / reflect / search / etc.
bash ~/Antigravity/crickets/install.sh /path/to/your-project --skill memory

Installations are idempotent; --hooks is opt-in for verification hooks. Windows: use install.ps1 with PowerShell 7+; same flag shape with -Hooks and -Update.

More install detail — seed your always-load entries + verify

4. Seed your always-load entries

Capture your locked conventions, coding-style rules, project invariants under <vault>/personal-private/_always-load/. One entry per concern. The first pass is co-created — you and the agent walk through it together; you approve each entry.

5. Verify

python3 ~/Antigravity/agentm/scripts/harness_memory.py recall --phase setup

Should print your always-load entries within the 4000-token budget.

Full install detail: wiki/how-to/Install-Into-Project.md.

How it works

flowchart LR
    U([You])
    H[Host<br/>Claude Code · Antigravity]
    A[Adapter<br/>commands · agents · skills]
    S[Canonical specs<br/>harness/]
    ST[(.harness/<br/>state)]
    W[(wiki/<br/>→ GitHub Wiki)]
    V[(AgentMemory<br/>vault — synced)]

    U -->|/slash command| H
    H --> A
    A --> S
    S --> ST
    S --> W
    S --> V
    V --> A

Phases

Command	Purpose
`/setup`	First-time project init — scaffold, `init.sh`, feature list, vault recall
`/plan`	Turn a brief into `.harness/PLAN.md` — tasks with pass/fail criteria
`/work`	Execute one task from the plan; evidence-tracked; update progress; stop
`/review`	Adversarial critique of the change — must produce executable artifact
`/release`	Pre-merge gate — clean tree, verification passes, changelog, paired-release coordination
`/bugfix`	Report → Analyze → Fix → Verify pipeline with GitHub Issue as posterity record

Every phase auto-recalls relevant entries from your AgentMemory vault at start, and offers to save new durable knowledge at exit. Self-modulating offer-save (confidence-thresholded) and cursor-tracked promotion keep the vault current without nagging you.

Skills shipped with the harness

Legacy single-file canonical skills (delivered via the per-host adapters/ pipeline):

Skill	What it does
`migrate-to-diataxis`	One-shot migration of an already-installed project's `wiki/` to the Diátaxis four-mode layout. Preview-first, `git mv` for blame, non-destructive. (Superseded by `diataxis-author` for new work; kept for legacy migration.)
`doctor`	User-invoked (`/doctor`). Verifies the install is correctly wired up in this host — structural by default, `--live` adds real sub-agent dispatches and skill dry-runs.

Compound skills imported from Crickets in v4.0.0 (V4 #36) — delivered via the manifest-walking dispatcher in install.sh / install.ps1:

Skill	What it does
`memory`	The Agent M memory skill itself. `/memory save` / `evolve` / `reflect` / `search` / `index-skills` / `discover-skills` / `adapt-skills` / `watchlist` / `promote`. Permeable A3 write boundary; collision-checked; supersession-not-deletion. Powers the recall + reflect hook loop.
`design`	Human-facing design pipeline → agent execution handoff. `/design author` walks a 10-section template; `/design translate` splits the approved design into structural parts; `/design sequence` generates a `PLAN.md` per part for `/work` + `/release` flow.
`diataxis-author`	Author + maintain a Diátaxis-style wiki for any repo. `/diataxis author` / `check` / `repair` / `migrate` / `classify`. Subsumes the harness's `migrate-to-diataxis` predecessor.
`ship-release`	Cut a tagged GitHub release with semver-driven version bumps from conventional commits. Writes CHANGELOG, tags, pushes, creates the release.

Hooks (claude-code only per ADR 0009):

Hook	What it does
`memory-recall-session-start`	SessionStart event → loads always-load entries from your vault into the agent's context (deduped, status-filtered, ~500ms budget).
`memory-recall-prompt-submit`	UserPromptSubmit event → keyword + vector-search recall of relevant entries based on the current prompt (~300ms budget; never blocks the prompt).
`memory-reflect-stop`	Stop event → mines the session transcript for durable-knowledge candidates (preferences, workflows, fixes, ideas); HIGH-confidence auto-saves to canonical paths, MEDIUM/LOW + ideas land in `_inbox/`.
`memory-reflect-idle`	SessionStart event → recovers orphan reflection markers from crashed sessions, processes deferred reflection candidates.
`evidence-tracker`	Default-FAIL evidence enforcement on `/work` task closeouts. Blocks `[ ]` → `[x]` flips in `PLAN.md` unless the agent demonstrably `Read` the spec/test files first. Hybrid resolver (heuristic + per-task override + explicit opt-out).

Sub-agents (imported in v4.0.0; existing 4 legacy agents at harness/agents/):

Sub-agent	What it does
`memory-idea-researcher`	Read-only deep-research worker for `_idea-incubator/` skeletons. Bounded wall-time / web-fetch / token budgets enforced from the skeleton's frontmatter.

Plugins (Antigravity 2.0 / agy v1.0.2+):

Plugin	What it does
`example-plugin`	Reference plugin showing the Antigravity 2.0 plugin manifest format. Install via `bash scripts/install-plugin.sh example-plugin`.

Base primitives + the 2 evaluator sub-agents (evaluator, diataxis-evaluator) + 3 operator-control hooks (kill-switch, steer, commit-on-stop) + 2 utility skills (pii-scrubber, dependabot-fixer) live in Crickets. (The memory-flow adapt-evaluator sub-agent moved to agentm in V4 #23 — memory primitives are agentm-native.) See ADR 0012 for the device-wide-by-default split rationale and ADR 0006 for the original split decision.

Telemetry

.harness/progress.md accumulates evidence of whether the harness is working. Run .harness/scripts/telemetry.sh for a per-project report or --all for multi-project. Signal definitions in harness/telemetry.md.

Repo structure

Top-level layout

agentm/
├── harness/          # canonical phase specs + harness-shipped skills (doctor, migrate-to-diataxis) + telemetry doc + principles
├── adapters/         # per-host wiring (claude-code/, antigravity/) — thin shims that point back at the canonical specs in harness/
├── wiki/             # Diátaxis-shaped docs (tutorials/ + how-to/ + reference/ + explanation/) — published as the GitHub Wiki
├── scripts/          # install helpers + smoke tests + harness_memory.py + manifest validators
├── templates/        # scaffolding (PLAN.md template, init.sh template) installed into target projects
├── assets/           # Agent M brand assets — logo, monogram, brand preview
├── lib/              # shared install plumbing — byte-identical to Crickets' lib/install/
├── AGENTS.md         # universal instructions for any AGENTS.md-aware host
├── CLAUDE.md         # Claude Code entry point — points back at AGENTS.md
├── install.sh        # POSIX installer (Linux + Mac)
└── install.ps1       # Windows installer (PowerShell 7+)

Architecture history

Agent M has grown over time across paired releases of agentm and crickets. The full V1→V4 evolution — what shipped, what's deferred, where the design is going — lives in Agent Memory Evolution on the Crickets side. V3 Retrospective covers what shipped, what we learned, what's deferred.

For the harness's design rationale, see harness/principles.md and the architecture decisions under wiki/explanation/decisions/.

Status

Currently shipping v4.11.0 — Opt-in --apply for personal-notes link-discovery (V4 #43 follow-up; single-repo MINOR). v4.10.0's audit was read-only; --apply now writes the safe suggestions into a marked ## Related section per note — backup-first (<vault>/_meta/notes-backup-<date>.tar.gz, refuses to apply if backup fails, prints the tar xzf revert), idempotent (re-running merges into the one section, never duplicates), skipping unsafe-named + already-linked targets in both directions. Read-only stays the default; --apply is the operator-directed opt-in (A3). A feedback-loop fix excludes the agent ## Related section from the TF-IDF/embedding scoring body so injected link text can't inflate similarity. An adversarial review caught + fixed destructive data loss — the agent block is now anchored to its %% … %% comment line (never a bare substring), so a prose mention of the tool or a human-authored ## Related is never clobbered. Dogfooded on the operator's own ~390-note vault (143 links across 64 notes + a one-time rename of 25 bracketed-date filenames). +9 tests (426 → 435). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.

Prior: v4.10.0 — Personal-notes link-discovery audit (V4 #43; single-repo MINOR). The read-only complement to v4.9.0's vault lint. Where vault_lint.py lints the agent-shaped AgentMemory/ entries and skips the operator's free-form notes, harness/skills/memory/scripts/notes_link_discovery.py audits those personal notes for missing connections between them — "these two notes look related but aren't [[linked]]." Two relatedness signals over the personal-notes corpus (the Obsidian root minus AgentMemory//.obsidian//.trash/.git): TF-IDF lexical overlap (always on) + embedding semantic similarity (opt-in --embeddings, local BGE via embed.py, cached at <vault>/_meta/notes-embeddings.json separate from the AgentMemory vec-index, graceful-skip to TF-IDF-only). A --report writes both signals side-by-side to <vault>/_meta/notes-links-<date>.md with paste-ready [[wikilinks]]; it never touches a personal note (A3, DC-1) and is strictly personal↔personal (DC-2). Live dogfood (392 notes) drove clip-HTML + bilingual-Spanish cleaning; the standout embedding-only catch — the same talk in [SP] and [EN] (cross-language, cosine 0.988) — is one TF-IDF structurally can't make. Three adversarial reviews caught three real fixes (small-corpus band collapse, a destructive --out hole, stale-dimension cache reuse), each regression-tested. +36 tests (406 → 426 per OS workflow). Deferred follow-ups: watcher/real-time re-embed (re-run is fine at this size); packaging into the crickets personal-notes bundle (V4 ④). Next ROADMAP picks: V4 #38 wiki bundle / #40 crickets-plugins-consolidation / #34 vault aesthetic pass. Single-repo release; crickets unaffected. See CHANGELOG.md and the latest release.

Contributing

Self-tested on every push by three per-OS workflows (Linux, Mac, Windows) running in parallel. Run the same gates locally with bash scripts/smoke-install-bash.sh. Details and the full invariant list in CONTRIBUTING.md.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What's where

Why Agent M?

Get started

How it works

Phases

Skills shipped with the harness

Telemetry

Repo structure

Architecture history

Status

Contributing

About

Uh oh!

Releases 44

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 246 Commits
.claude		.claude
.github		.github
adapters		adapters
assets		assets
harness		harness
lib/install		lib/install
scripts		scripts
templates		templates
wiki		wiki
.gitattributes		.gitattributes
.gitignore		.gitignore
.gitleaks.toml		.gitleaks.toml
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
install.ps1		install.ps1
install.sh		install.sh
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

What's where

Why Agent M?

Get started

How it works

Phases

Skills shipped with the harness

Telemetry

Repo structure

Architecture history

Status

Contributing

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 44

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages