Nishal77/Codebreif


# codebrief

The surgical context layer for AI coding agents.

Stop writing CLAUDE.md by hand. Stop letting it go stale. Let your code write it.


```sh
npx codebrief init
```

30 seconds. Works with Claude Code, Cursor, Copilot, Windsurf, and Aider. Always current. Never hallucinated.




## What is codebrief

codebrief is an open source CLI tool that generates a minimal, always-accurate context file for AI coding agents. It works automatically from pure static analysis of your actual codebase. No LLM. No API keys. No hallucination.

When you use an AI coding agent like Claude Code, Cursor, or Copilot, the agent reads a context file at the start of every session. This file tells it how your project works: what commands to run, which modules must never import each other, which files are currently unstable, what decisions your team has made. Without it, the agent makes assumptions. With a stale one, it follows outdated instructions and produces wrong code.

codebrief generates that file for you. It reads your code, your Git history, your config files, and your import graph. It extracts only what the agent cannot figure out itself. It writes everything into a clean, under-80-line CODEBRIEF.md and updates it automatically on every commit through a pre-commit hook it installs for you.

Think of it as the missing link between your repository and the AI tools that work with it.


## The problem it solves

Every major AI coding tool reads a context file today. Most developers write and maintain these files by hand across six different formats.

| Tool | File it reads |
| --- | --- |
| Claude Code | `CLAUDE.md` |
| OpenAI Codex | `AGENTS.md` |
| Cursor | `.cursorrules` |
| GitHub Copilot | `.github/copilot-instructions.md` |
| Windsurf | `.windsurfrules` |
| Gemini CLI | `GEMINI.md` |

The hand-written approach breaks down in five ways.

**Fragmentation.** Six tools, six identical files, six places to forget to update when your architecture changes.

**Context rot.** A file that described your architecture three months ago now describes a codebase that no longer exists. The agent follows outdated instructions and produces wrong output.

**LLM-generated bloat.** Tools that auto-generate these files using an LLM make things worse, not better. ETH Zurich's 2026 study found that this reduces agent success rates by 3% and adds 20% to inference costs.

**Token waste.** Every line in every context file loads into every agent session, whether or not it is relevant to the current task.

**Wrong content.** Developers write codebase maps and stack summaries. Agents need surgical commands: the exact test flags, the import boundaries that must never be crossed, the files currently in active refactor.

codebrief solves all five. It generates one canonical file from code, not prose. It keeps it current automatically. It generates all six tool formats from that one source. And it scopes output to what is actually relevant.


## How it works

```sh
codebrief generate
```

Four analysis stages run in parallel under the hood, followed by synthesis and formatting. On a 50,000-line codebase the whole run completes in under two seconds. Nothing leaves your machine.

```text
Stage 1   Inventory      Walk all files, detect languages, fingerprint frameworks
Stage 2   AST Parse      Build import graph with ts-morph, detect architectural layers,
                         infer boundaries from directional import patterns
Stage 3   Git Analysis   Read 30 days of history, map change frequency per file,
                         detect reverts, find WIP markers in commit messages
Stage 4   Config Parse   Read package.json scripts, tsconfig path aliases,
                         ESLint enforced rules, Makefile targets
Stage 5   Synthesis      Combine all outputs, deduplicate against existing docs,
                         apply scope filter if running from pre-commit hook
Stage 6   Format         Write CODEBRIEF.md, generate all enabled tool adapter files
```

Every insight comes from facts in your repository. The analyzer cannot invent a module name that does not exist or describe an architecture that was never implemented. That is the core guarantee LLM-based generation cannot make.
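A rough sketch of that staged design: independent analyzers run concurrently, and synthesis merges and deduplicates their output. The analyzer names and `StageResult` shape below are invented for illustration; this is not the real codebrief internals.

```typescript
// Illustrative pipeline skeleton: the analysis stages are independent
// analyzers run concurrently; synthesis merges and deduplicates facts.
type StageResult = { stage: string; facts: string[] };

async function runPipeline(
  analyzers: Array<() => Promise<StageResult>>
): Promise<string[]> {
  // Run all analyzers concurrently (stages 1-4 in the table above).
  const results = await Promise.all(analyzers.map((run) => run()));
  // Synthesis (stage 5): merge facts and drop duplicates.
  const merged = results.flatMap((r) => r.facts);
  return Array.from(new Set(merged));
}

// Stand-in analyzers for the example:
const inventory = async () => ({ stage: "inventory", facts: ["lang: ts"] });
const gitScan = async () => ({
  stage: "git",
  facts: ["wip: src/auth", "lang: ts"],
});

runPipeline([inventory, gitScan]).then((facts) => console.log(facts));
// ["lang: ts", "wip: src/auth"] — the duplicate fact appears once
```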


## What gets extracted

Before writing any line, codebrief applies one test: could an AI agent discover this fact by reading the code, running ls, or checking package.json? If yes, it gets skipped.

| Category | What codebrief surfaces | What it deliberately skips |
| --- | --- | --- |
| Commands | Non-obvious flags, custom codegen scripts, Makefile targets with side effects, scripts that must run in a specific order | `npm start`, `npm test`, standard lifecycle scripts agents already know |
| Architecture | Import boundaries inferred from actual usage patterns, actively violated boundaries, deprecated module paths | Module list, folder structure, file counts visible from `ls` |
| Danger zones | Files in active migration, recently reverted modules, files touched by three or more contributors this week | Stable, well-tested modules agents can navigate without warnings |
| Decisions | Non-obvious tech choices with context behind them, recently merged architectural pivots from Git log | Stack summary, framework versions already in package.json |
| Conventions | Rules that would surprise a competent engineer, patterns agents commonly get wrong in this specific codebase | Standard style rules already enforced by ESLint or Prettier |
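That discoverability test can be sketched as a predicate over candidate facts. The `Candidate` shape and the list of "standard" scripts below are simplifying assumptions for illustration, not codebrief's actual heuristics.

```typescript
// Hypothetical discoverability filter: drop anything an agent could find
// by itself (e.g. bare npm lifecycle scripts), keep the non-obvious rest.
const STANDARD_SCRIPTS = new Set(["start", "test", "build", "lint"]);

interface Candidate {
  kind: "command" | "convention";
  name: string;
  body: string;
}

function isSurfaceable(c: Candidate): boolean {
  if (c.kind === "command") {
    // A standard lifecycle script with no extra flags is discoverable:
    // the agent already knows to try `npm test` or `npm start`.
    const bare = STANDARD_SCRIPTS.has(c.name) && !c.body.includes("--");
    return !bare;
  }
  return true; // other categories pass through to later stages
}

const candidates: Candidate[] = [
  { kind: "command", name: "test", body: "vitest" },
  { kind: "command", name: "seed", body: "bun run db:seed --env=test" },
];
console.log(candidates.filter(isSurfaceable).map((c) => c.name)); // ["seed"]
```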

## Output example

Here is what gets written for a real NestJS and Prisma project:

```markdown
<!-- CODEBRIEF.md — generated by codebrief v0.1.0 -->
<!-- Updated: 2026-04-16 10:23 UTC | 47 lines | codebrief v0.1.0 -->

## Commands
- **test**: `bun test --reporter verbose --coverage`
- **codegen**: `bun run codegen` — run BEFORE build, generates Prisma client types
- **seed**: `bun run db:seed --env=test` — required before any integration tests

## Architecture Boundaries
- `api/**` never imports from `infra/**`
- `core/**` never imports from `api/**`
- Data access always through `core/repos/**`, never raw Prisma calls in API layer

## Active Danger Zones
- `src/auth/session.ts` — recently reverted, contains instability (since 2026-04-10)
- `src/billing/**` — active Stripe webhook refactor, 4 contributors this week

## Non-Obvious Conventions
- Use `z.parse()` not `z.safeParse()` — errors handled at the boundary layer
- Named exports only, no default exports — enforced by ESLint but worth stating for new modules
- Redis cache keys: prefix with `service:entity:id` — see src/cache/keys.ts:12

## Recent Decisions
- [2026-04-12] Switched state management from Redux to Zustand — see ADR-007.md
- [2026-04-05] Deprecated /pages directory — App Router only, no new files in /pages

<!-- Tools: CLAUDE.md   .cursorrules   copilot-instructions.md   AGENTS.md -->
```

47 lines. No directory tree. No stack summary. No generic advice. Every line is a fact the agent cannot find anywhere else in the repository.

A typical auto-generated CLAUDE.md runs 200 lines of folder structure, framework descriptions, and generic guidelines, all of which the agent can already see. codebrief writes only what agents actually need.


## Installation

Global install, recommended for teams and daily use:

```sh
npm install -g codebrief
pnpm add -g codebrief
```

One-time use via npx, no install required:

```sh
npx codebrief init
```

Per-project as a dev dependency:

```sh
npm install -D codebrief
pnpm add -D codebrief
yarn add -D codebrief
```

### Requirements

- Node.js 18 or higher
- Git initialized in the project
- Works with TypeScript, JavaScript, or mixed repositories

## Getting started

Run this once inside any existing project:

```sh
cd your-project
codebrief init
```

The init command does four things automatically:

  1. Runs a full static analysis of the current repository
  2. Generates CODEBRIEF.md in the project root
  3. Creates all enabled tool adapter files including CLAUDE.md, .cursorrules, .github/copilot-instructions.md, and AGENTS.md
  4. Installs a pre-commit Git hook so every future commit updates the brief automatically

From that point on you never touch these files by hand. Every commit keeps them current.

If you want to run the analysis without installing the hook, use `codebrief generate` directly.


## CLI commands

| Command | What it does |
| --- | --- |
| `codebrief init` | First-time setup. Creates `codebrief.config.ts`, installs the Git pre-commit hook, and runs an initial analysis. |
| `codebrief generate` | Full analysis and regeneration of `CODEBRIEF.md` and all tool adapter files. Runs in under 2 seconds. |
| `codebrief watch` | Daemon mode. Watches for file changes and auto-regenerates affected sections with a 1-second debounce. |
| `codebrief diff` | Shows what changed in `CODEBRIEF.md` since the last Git commit. Useful when reviewing PRs. |
| `codebrief scope` | Prints a scoped brief for the given file paths to stdout. Called internally by the pre-commit hook. |
| `codebrief validate` | Exits with code 1 if `CODEBRIEF.md` is stale. Designed to run in CI pipelines. |

### Common options

```sh
codebrief generate --scope "src/api/users.ts src/core/user.service.ts"
codebrief generate --quiet       # suppress all output except errors
codebrief generate --repo /path  # override the repository root
```

## Tool support

codebrief writes one canonical CODEBRIEF.md and generates all tool-specific formats from it automatically. One source of truth, six tools served. All files update together on every commit.

| AI Tool | Generated file | Notes |
| --- | --- | --- |
| Claude Code | `CLAUDE.md` | Enabled by default |
| OpenAI Codex and compatible agents | `AGENTS.md` | Enabled by default |
| Cursor | `.cursorrules` | Plain text format, HTML comments stripped |
| GitHub Copilot | `.github/copilot-instructions.md` | Directory created automatically if missing |
| Windsurf | `.windsurfrules` | Opt-in via `adapters.windsurf: true` in config |
| Gemini CLI | `GEMINI.md` | Opt-in via `adapters.gemini: true` in config |
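As a small illustration of what an adapter transform involves, the comment stripping the table mentions for `.cursorrules` could look roughly like this. The regex approach is an assumption for the sketch, not the actual adapter code.

```typescript
// Hypothetical Cursor adapter transform: reuse the canonical markdown
// but strip HTML comment metadata for the plain-text .cursorrules file.
function toCursorRules(brief: string): string {
  return brief
    .replace(/<!--[\s\S]*?-->/g, "") // drop HTML comment metadata
    .replace(/\n{3,}/g, "\n\n")      // collapse blank lines left behind
    .trim();
}

const brief = "<!-- generated by codebrief -->\n\n## Commands\n- test: `bun test`";
console.log(toCursorRules(brief)); // "## Commands\n- test: `bun test`"
```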

Do not edit individual adapter files manually. They get overwritten on the next `codebrief generate` run. All customizations belong in `codebrief.config.ts`.


## MCP server

codebrief ships as a Model Context Protocol server. Any MCP-compatible agent can call the server to get context dynamically, scoped to specific files rather than loading the full brief on every request.

Add this to your Claude Desktop config:

```json
{
  "mcpServers": {
    "codebrief": {
      "command": "npx",
      "args": ["@codebrief/mcp-server"]
    }
  }
}
```

For a custom repository path:

```json
{
  "mcpServers": {
    "codebrief": {
      "command": "npx",
      "args": ["@codebrief/mcp-server"],
      "env": {
        "CODEBRIEF_REPO": "/absolute/path/to/your/project"
      }
    }
  }
}
```

Three tools are exposed over the stdio transport.

`codebrief_get` returns the full brief or a named section. Call this once at session start to orient the agent.

```json
{ "format": "full" }
{ "format": "commands" }
{ "format": "architecture" }
{ "format": "conventions" }
```

`codebrief_scope` returns only the context relevant to a specific set of files. This is the token-efficient path. Use it before editing a file rather than loading the entire brief.

```json
{ "files": ["src/api/users.ts", "src/core/user.service.ts"] }
```

`codebrief_check` checks whether a specific file is a danger zone before editing. It returns a warning with severity and reason if it is, or a clear confirmation if it is safe.

```json
{ "path": "src/auth/session.ts" }
```

The scoped tool eliminates the overhead of loading 80 lines of context on every agent interaction. Only the boundaries, danger zones, and conventions relevant to the files being touched are returned.
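One way to picture the scoped lookup: brief entries keyed by path patterns, filtered to the requested files. The entry shape and prefix-based matching below are simplified stand-ins for the server's real logic.

```typescript
// Illustrative scoped lookup: return only brief entries whose path
// pattern matches one of the requested files.
interface BriefEntry {
  pattern: string; // e.g. "src/billing/**" or an exact file path
  text: string;
}

function matches(pattern: string, file: string): boolean {
  // Treat "dir/**" as a prefix match; otherwise require an exact path.
  return pattern.endsWith("/**")
    ? file.startsWith(pattern.slice(0, -2))
    : file === pattern;
}

function scopeBrief(entries: BriefEntry[], files: string[]): string[] {
  return entries
    .filter((e) => files.some((f) => matches(e.pattern, f)))
    .map((e) => e.text);
}

const entries: BriefEntry[] = [
  { pattern: "src/billing/**", text: "active Stripe webhook refactor" },
  { pattern: "src/auth/session.ts", text: "recently reverted" },
];
console.log(scopeBrief(entries, ["src/billing/invoice.ts"]));
// ["active Stripe webhook refactor"]
```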


## Pre-commit hook

The pre-commit hook is what makes codebrief truly set-and-forget. Running `codebrief init` writes the following script to `.git/hooks/pre-commit` and makes it executable automatically.

```sh
#!/bin/sh
# codebrief pre-commit hook
# Installed by: codebrief init
# To reinstall, run: codebrief init --force

STAGED=$(git diff --cached --name-only --diff-filter=ACM)

if [ -z "$STAGED" ]; then
  exit 0
fi

npx codebrief generate --scope "$STAGED" --quiet || true

git add CODEBRIEF.md CLAUDE.md .cursorrules \
  .github/copilot-instructions.md AGENTS.md 2>/dev/null || true

exit 0
```

**Scoped regeneration.** The hook passes staged file paths to the generator so only the sections of the brief affected by those files get recalculated. This keeps the hook fast on large codebases, typically under 500 ms.

**Non-blocking by design.** Both the generate call and the `git add` use `|| true`. If codebrief fails for any reason, the commit proceeds unchanged. The developer's work is never blocked by the context update.

**Respects existing hooks.** If a pre-commit hook already exists when you run `codebrief init`, the tool appends to it rather than overwriting. The hook is checked for the codebrief signature before anything is touched.
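The append-don't-overwrite behavior can be sketched as pure string logic, keying off the `# codebrief pre-commit hook` signature line from the script above. File I/O and chmod handling are omitted; this is an illustration, not the installer's actual code.

```typescript
// Hypothetical hook-merging logic: install fresh if no hook exists,
// leave an already-installed hook alone, append after a user's hook.
const SIGNATURE = "# codebrief pre-commit hook";

function mergeHook(existing: string | null, codebriefHook: string): string {
  if (existing === null) return codebriefHook;        // no hook yet: install fresh
  if (existing.includes(SIGNATURE)) return existing;  // already installed: no-op
  return existing.trimEnd() + "\n\n" + codebriefHook; // append after user's hook
}

const userHook = "#!/bin/sh\nnpm run lint";
const merged = mergeHook(userHook, SIGNATURE + "\nnpx codebrief generate --quiet");
console.log(merged.includes("npm run lint")); // true — user's hook preserved
```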

To reinstall or reset the hook at any time:

```sh
codebrief init --force
```

## Configuration

codebrief works with zero configuration on any TypeScript, JavaScript, or mixed repository. Just run `codebrief init` and it works.

For teams that need to customize behavior, `codebrief init` creates a fully typed `codebrief.config.ts` at the project root. Every field is optional.

```ts
import type { BriefConfig } from '@codebrief/core';

const config: BriefConfig = {
  // Glob patterns to exclude from analysis.
  // node_modules, .git, and dist are always excluded regardless of this setting.
  exclude: ['src/generated/**', 'src/migrations/**'],

  // Explicit architectural boundaries.
  // codebrief infers these from import patterns automatically.
  // Use this to declare boundaries that do not yet appear in code,
  // or to enforce ones that are only partially respected.
  boundaries: [
    { from: 'api', to: 'infra', rule: 'never' },
    { from: 'core', to: 'api', rule: 'never' },
  ],

  // Files to always flag as danger zones regardless of Git activity.
  dangerZones: [
    {
      path: 'src/auth',
      reason: 'In migration to new auth provider, do not add logic here',
      severity: 'high',
    },
  ],

  // Days of Git history to analyze for danger zone detection. Default is 30.
  gitLookbackDays: 30,

  // Maximum lines in the generated CODEBRIEF.md.
  // Content is prioritized by signal value and trimmed to this limit. Default is 80.
  maxLines: 80,

  // Minimum number of unique authors on a file within the lookback window
  // before it gets automatically flagged as a danger zone. Default is 3.
  dangerZoneAuthorThreshold: 3,

  // Tool adapters to generate alongside CODEBRIEF.md.
  adapters: {
    claude: true,    // writes CLAUDE.md
    cursor: true,    // writes .cursorrules
    copilot: true,   // writes .github/copilot-instructions.md
    agents: true,    // writes AGENTS.md
    windsurf: false, // writes .windsurfrules, opt in if your team uses Windsurf
    gemini: false,   // writes GEMINI.md, opt in if your team uses Gemini CLI
  },
};

export default config;
```
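To illustrate what a `rule: 'never'` boundary means in practice, here is a toy checker over an import graph. The edge and rule shapes are invented for the example and are not the real `@codebrief/core` types.

```typescript
// Toy boundary checker: flag import edges that cross a "never" boundary.
interface ImportEdge { from: string; to: string } // file paths
interface Boundary { from: string; to: string; rule: "never" }

// Assume the top-level directory names the architectural layer.
const layerOf = (file: string): string => file.split("/")[0];

function findViolations(edges: ImportEdge[], rules: Boundary[]): ImportEdge[] {
  return edges.filter((e) =>
    rules.some(
      (r) =>
        r.rule === "never" &&
        layerOf(e.from) === r.from &&
        layerOf(e.to) === r.to
    )
  );
}

const edges: ImportEdge[] = [
  { from: "api/users.ts", to: "infra/db.ts" },  // crosses api -> infra
  { from: "api/users.ts", to: "core/user.ts" }, // allowed
];
const rules: Boundary[] = [{ from: "api", to: "infra", rule: "never" }];
console.log(findViolations(edges, rules)); // the single api -> infra edge
```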

## Language support

codebrief performs deep AST analysis on TypeScript and JavaScript using ts-morph. For all other languages it uses tree-sitter WASM bindings, which require no native compilation and work in any Node.js environment.

| Language | Import analysis | Boundary detection | Status |
| --- | --- | --- | --- |
| TypeScript | Full AST with type resolution | Yes, inferred from import patterns | Stable |
| JavaScript (ESM and CJS) | Full AST | Yes, inferred from import patterns | Stable |
| Python | tree-sitter | Planned | Open for community contribution |
| Go | tree-sitter | Planned | Open for community contribution |
| Rust | tree-sitter | Planned | Open for community contribution |
| Java | tree-sitter | Planned | Open for community contribution |
| Ruby | tree-sitter | Planned | Open for community contribution |

Git analysis, command extraction, and config parsing work on any repository regardless of language. Language-specific AST analysis adds import boundary detection and convention inference on top of that base.
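The language-independent Git side can be pictured with the contributor-count heuristic behind danger zone detection (the `dangerZoneAuthorThreshold` config field). The `Commit` shape here stands in for parsed `git log` output; the real analyzer works from actual history.

```typescript
// Sketch of the author-count danger-zone heuristic: a file touched by
// `threshold` or more unique authors inside the lookback window is flagged.
interface Commit { author: string; files: string[] }

function dangerZones(commits: Commit[], threshold = 3): string[] {
  const authorsByFile = new Map<string, Set<string>>();
  for (const c of commits) {
    for (const f of c.files) {
      const set = authorsByFile.get(f) ?? new Set<string>();
      set.add(c.author);
      authorsByFile.set(f, set);
    }
  }
  return Array.from(authorsByFile.entries())
    .filter(([, authors]) => authors.size >= threshold)
    .map(([file]) => file);
}

const log: Commit[] = [
  { author: "ana", files: ["src/billing/webhook.ts"] },
  { author: "ben", files: ["src/billing/webhook.ts"] },
  { author: "cho", files: ["src/billing/webhook.ts", "README.md"] },
];
console.log(dangerZones(log)); // ["src/billing/webhook.ts"]
```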


## Contributing

codebrief is designed to be contributed to. Every analyzer is a self-contained module. Adding a new language, a new tool adapter, or a new detection pattern does not require understanding the full system. You only need to understand the module you are adding.

Read CONTRIBUTING.md before opening a PR.

### Good first issues

Each of these is isolated with no dependencies on other open work. Every one ships real value on its own.

| Task | Effort | What to build |
| --- | --- | --- |
| ConventionAnalyzer | Medium | Detect naming patterns and type usage from the AST |
| DecisionAnalyzer | Easy | Parse ADR files and CHANGELOG.md for team decisions |
| Windsurf adapter | Easy | Format the brief for `.windsurfrules` |
| Gemini adapter | Easy | Format the brief for `GEMINI.md` |
| GitHub Action | Easy | Wrap `codebrief validate` as a reusable CI step |
| Python language support | Medium | tree-sitter WASM for Python import analysis |
| Go language support | Medium | tree-sitter WASM for Go module analysis |
| Rust language support | Medium | tree-sitter WASM for Rust crate analysis |
| VS Code extension | Hard | Sidebar view, live updates, inline danger zone indicators |

### Quick start for contributors

```sh
# Fork then clone your fork
git clone https://github.com/your-username/Codebreif.git
cd Codebreif

# Install dependencies (pnpm is required for this monorepo)
pnpm install

# Build all packages
pnpm build

# Run the CLI against this repo as a smoke test
node packages/cli/dist/index.js generate

# Run the full test suite
pnpm test
```

### Standards every PR must meet

- `pnpm build` passes with no TypeScript errors. `strict: true` is enforced throughout.
- `pnpm test` passes against real fixture repositories, not mocks.
- No `any` types. Use `unknown` with type guards at system boundaries.
- JSDoc on all exported functions and types.
- Error messages tell users what to do, not just what went wrong.
- README.md updated if a new command, adapter, or config field was added.

## Research foundation

codebrief is grounded in peer-reviewed research, not intuition.

- **Study**: arXiv:2602.11988
- **Published by**: ETH Zurich, February 2026
- **Scale**: 138 repositories, four AI coding agents, four context file conditions

The study compared agent task success rates and inference costs across four conditions: no context file, human-written minimal, human-written comprehensive, and LLM-generated.

| Condition | Task success | Inference cost |
| --- | --- | --- |
| No context file | Baseline | Baseline |
| Human-written minimal | +4% | No change |
| Human-written comprehensive | -1% | Slightly higher |
| LLM-generated | -3% | +20% |

The mechanism the paper identified: agents follow instructions literally. Mentioning a tool causes agents to use it 1,600% more often. Describing directory structure causes agents to search more files than necessary. More instructions mean more steps, which means more cost and often worse results.

The paper's conclusion in section 6.1 explicitly called for a tool that generates minimal, factual, static-analysis-derived context files automatically. codebrief is the implementation of that recommendation.


## License

MIT. See LICENSE for the full text.

You can use codebrief in commercial projects, fork it, modify it, and redistribute it freely. The only requirement is that you preserve the copyright notice.


## Built by and for the community

codebrief was started by @nishalbuilds in April 2026, directly motivated by the ETH Zurich research. It is MIT licensed and structured from the first commit to invite contribution.

The goal is not to build a product. The goal is to establish a standard. Every developer using AI coding tools deserves a context file that is current, minimal, and generated from their actual code, not written by hand, not generated by an LLM, not a maintenance burden. This is infrastructure for the AI-assisted development era and it belongs to everyone who builds software.

If codebrief has improved your workflow, contribute something back. A bug fix, a language plugin, a test fixture, a documentation improvement, or just telling another developer it exists. Every contribution compounds.


Built by @nishalbuilds with the community in mind

Report a bug   Request a feature   Contributing guide

MIT License
