Java Harness Agent 🚀

An Agent-Driven "Microkernel" Operating System for Backend Engineering

⚠️ Critical Positioning Statement

"Learning Agent architecture is like re-learning Operating Systems. History doesn't repeat, but it rhymes!"

This repository is a machine-to-machine (M2M) infrastructure. It is a Cognitive Harness—an executable protocol designed by humans, but read, interpreted, and executed exclusively by Large Language Models (LLMs).

Unlike traditional Agent frameworks that act as bloated "Macro-kernels," Java Harness Agent adopts an extremely restrained Microkernel OS Philosophy:

Process = Intent Boundary: Cross-intent requires explicit communication (WAL Write-back).
RAM = Context Window: Strictly scheduled by the architecture.
System Calls = Tool Use: Traps into the kernel via system calls, authenticated by the Role Matrix.
File System = RAG & Wiki: Mounted on demand, burned after use.

Java Harness Agent is an agent-driven backend engineering workflow designed for sustainable software evolution. It deeply integrates Cognitive Philosophy (counter-intuitive bias checks, first-principles thinking) and pioneers a Dual-Track Flow with a 4-Level Risk Matrix. Driven by a rich ecosystem of high-density Master Skills, it completely eliminates "runaway code" and "architecture rot" common in traditional Agent development.

📖 Overview

Java Harness Agent fuses the "Contract-First" OpenSpec design philosophy with a Microkernel architecture. Through its Intent Gateway, Dual-Track Lifecycle, Vector-less Knowledge Graph (LLM Wiki), and Cognitive Brakes, it achieves a sustainable, interruptible, and self-correcting engineering closed-loop.

✨ Key Features

🎯 OS-Level Intent Driven: Natural language → Structured intent queues → Process-level task scheduling
🧠 Cognitive Philosophy: Built-in cognitive bias correction and 5-Whys decision frameworks, forcing the Agent to "think thrice" (Cognitive Brake) before acting.
🛤️ Dual-Track & 4-Level Risk Matrix: Differentiates between TRIVIAL (Fast-path), LOW (PATCH track), and MEDIUM/HIGH (STANDARD full 6-phase), abandoning one-size-fits-all cumbersome processes.
📚 Microkernel Knowledge Graph: Completely discards the "black box" of vector databases, utilizing a pure Markdown hierarchical mounting system to ensure 100% context determinism.
🛡️ Self-Correcting & Gating: Automatic guard hooks, failure recovery, and mandatory human-in-the-loop checkpoints (Approval Gate).
🔌 Rich Skill Ecosystem: Organized across 6 categories (Defaults, Role-Required, Business, Engineering Pipeline, Java Standards, QA & Debugging, Workflow, Meta), covering every lifecycle phase.

💰 Token Economics & Cost Model

Given that Java Harness Agent is a strongly-constrained framework, its architecture shifts costs from "Trial & Error / Blind Search" to "Upfront Planning & Gating Defenses", resulting in highly predictable and stable overall costs for complex tasks.

1. The "Thinking Tax"

Each turn requires the LLM to output the <Cognitive_Brake> and read mandatory system contexts. This adds a fixed baseline "thinking tax" of ~500 Output Tokens and ~2000 Input Tokens per interaction.
With the integration of the Cognitive Framework, the Agent must first self-reflect (anti-bias check), adding a few hundred tokens upfront but saving tens of thousands of tokens otherwise wasted on architectural rewrites.

2. The ROI: Comparing 3 Paradigms

Paradigm	Behavior	Input Tokens	Output Tokens	Hidden Costs / Risks	Verdict
Pure Chat / Copilot	Jumps straight to coding with limited context.	~5k	~1k	High Rework Rate. Misses transaction boundaries, forgets existing enums. Requires human prompt corrections.	Cheap in Tokens, Expensive in Human Time.
Macro-kernel Auto-Agent	Blindly searches, loads all skills at once, loops endlessly on compile errors.	100k+	10k+	Disastrous. Burns through budget via massive context bloat and infinite loops.	Unpredictable & Dangerous.
Microkernel Harness Agent	Pays the "Thinking Tax", utilizes Dual-Track and funnel throttling, STOPs at high-risk gates.	~30k	~6k	Highly predictable. Architectural errors intercepted early; syntax errors digested by Shift-Left Validation.	The Sweet Spot. Optimized for high-quality delivery with controlled spend.

🏗️ Architecture: Microkernel OS Philosophy

Core Philosophy

Three Fundamental Problems Solved:

Context Bloat (OOM): LLM blind searching wastes tokens → Solved via pure-text mounted File System and "burn-after-reading".
Requirement Drift (Privilege Escalation): Agent free-play corrupts contracts → Solved via Microkernel Intent Gateway + strict Role Matrix guards.
Knowledge Fragmentation (Memory Leaks): Conversation memory loss → Solved via WAL (Write-Ahead Logging) write-backs and agile Garbage Collection (GC).

🎭 The 13 Virtual Heroes (Role Matrix)

The Agent is not an isolated "full-stack LLM," but a hardcore virtual team of 13 heroes with vastly different personalities. The LLM must dynamically mount these roles and use their exclusive weapons (Python Gate Scripts) to defend system discipline.

🛡️ Phase 1: Explorer (The Fog of War)

@Requirement Engineer: "Do not send me garbage words like 'optimize'. Give me boundaries, or stay quiet!" (Weapon: ambiguity_gate.py)
@Ambiguity Gatekeeper: "Wait, you want to global grep? Draw the focus_card.md red lines first!" (Weapon: focus_card.md Rune)

🏛️ Phase 2 & 3: Propose & Review (Architecture & The Crucible)

@System Architect: "The blast radius is calculated. Build according to my openspec.md blueprint!" (Weapon: Approval Gate Summoning Circle)
@Devil's Advocate: "Oh Architect, do you really think this logic survives high concurrency deadlocks?" (Weapon: cognitive-bias-checklist)

⚔️ Phase 4 & 5: Implement & QA (Coding & Relentless Testing)

@Lead Engineer: "The contract is the law. I only implement openspec.md." (Weapon: javac Furnace of Truth)
@Focus Guard: "Your hands reach too far! Pull back inside the Focus Card ward!" (Weapon: scope_guard.py Ruler of Discipline)
@Code Reviewer: "Magic Numbers? N+1 query risks? Rewrite this filthy code!" (Weapon: Static Linter Light of Purification)

📜 Phase 6: Archive (Memory Persistence)

@Knowledge Extractor (Silent Historian): "Empires fall, but History (WAL) is eternal." (Weapon: writeback_gate.py Judgment of History)
@Documentation Curator (Friend of Humanity): "Show humans some care. Comments must explain Why, not What." (Weapon: README & Javadoc)
@Skill Graph Curator (OCD Librarian): "Once the index is messed up, the whole world loses its way." (Weapon: skill_index_linter.py)

🌌 Background Daemons (Garbage Collection)

@Librarian (Midnight Scavenger): "Shh... Do not wake me unless you bring the @gc command to merge fragments." (Weapon: librarian_gc.py)
@Knowledge Architect (Urban Planner): "This document exceeds 400 lines! LLMs will suffer OOM reading this! Split it!" (Weapon: Structural Reorganization)

System Architecture Diagram

graph TD
    subgraph Input["End-User Input"]
        User["User Request"]
        Shortcut["Fast Syscall: read/patch/standard"]
    end

    subgraph Kernel["Kernel Router - Gateway"]
        IG["Intent Gateway: Parser and Anti-Bias"]
        Risk{"4-Level Risk Matrix"}
        TRIVIAL["TRIVIAL: Fast Path"]
        LOW["LOW: PATCH Track"]
        MEDHIGH["MEDIUM / HIGH: STANDARD Track"]
    end

    subgraph Context["Virtual Memory - Context"]
        DirectRead["Register Read: Explicit Scope"]
        Funnel["Page Table Funnel: Sitemap to Index"]
        Budget["OOM Killer: Wiki<=3, Code<=8"]
    end

    subgraph Knowledge["File System - RAG/Disk"]
        KG["KNOWLEDGE_GRAPH.md: Mount Root"]
        DomainIndex["Partitions: api / data / domain"]
        Archive["Cold Backup: Archive"]
    end

    subgraph Lifecycle["Process Scheduler"]
        LaunchSpec["PCB: Launch Spec"]
        Phase1["1_Explorer: Clarify and Decompose"]
        Phase2["2_Propose: Freeze Contract"]
        Phase3["3_Review: Cognitive Critique"]
        ApprovalGate["Kernel Switch: Approval Gate"]
        Phase4["4_Implement: Atomic Execution"]
        Phase5["5_QA: 3D Testing"]
        Phase6["6_Archive: WAL Write-back and GC"]
    end

    subgraph Roles["Privilege Rings - Ring 0/3"]
        SysArchitect["System Architect: Arch Auth"]
        FocusGuard["Focus Guard: Segfault Guard"]
        DocCurator["Doc Curator: FS Write Auth"]
    end

    Input --> IG
    IG --> Risk
    Risk -->|TRIVIAL| TRIVIAL
    Risk -->|LOW| LOW
    Risk -->|MEDIUM/HIGH| MEDHIGH

    TRIVIAL --> DirectRead
    LOW --> LaunchSpec
    MEDHIGH --> LaunchSpec

    DirectRead --> Funnel
    Funnel --> Budget

    LaunchSpec --> Phase1
    Phase1 --> Phase2
    Phase2 --> Phase3
    Phase3 --> ApprovalGate
    ApprovalGate --> Phase4
    Phase4 --> Phase5
    Phase5 --> Phase6
    Phase6 --> LaunchSpec

🚦 Core Workflow: Dual-Track & 4-Level Risk Matrix

No more one-size-fits-all red tape. The framework assesses tasks at the Kernel entry point (Router) and routes them to different tracks:

4-Level Risk Matrix

Risk Level	Characteristics	Authorization	Testing	Rollback Cost
TRIVIAL	Queries, logging, typos, reading	Auto-Approve	Optional	Zero
LOW	Single-method bugfix, internal refactor (no API/DB change)	Implicit (PATCH)	Unit Tests	Very Low
MEDIUM	New APIs, DB column additions, cross-module calls	Explicit (Approval Gate)	Integration	High
HIGH	Core flow modifications, state machine changes	Explicit + Arch Review	Full Regression	Disastrous

Dual-Track Flow

1. PATCH Track (For TRIVIAL & LOW)

The Fast Path. Zero bureaucracy.

Skips the lengthy Propose and Review phases.
No heavy openspec.md generated; uses a lightweight focus_card.md.
Straight to implementation and testing.
Extremely low token cost, ideal for high-frequency, small iterations.

2. STANDARD Track (For MEDIUM & HIGH)

Heavy Armor. Defending the engineering baseline.

Strictly follows the full 6-phase lifecycle (Explorer → Propose → Review → Implement → QA → Archive).
Mandates the generation of openspec.md and triggers the Approval Gate (Human-in-the-loop) before writing any code.
Injects cognitive critique to interrogate the architectural design.

🔧 Skill Ecosystem

The skill ecosystem is organized across multiple categories, each mounted by lifecycle phase and role requirements. Skills are stored under .agents/skills/ with trae-skill-index serving as the global routing table.

Default Enabled Skills (Auto-Invoked)

Skill	Phase	Primary Role
`brainstorming`	Explorer / Propose	Requirement Engineer
`task-decomposition-guide`	Propose / Review	System Architect
`writing-plans`	Propose / Implement	System Architect / Lead Engineer
`systematic-debugging`	Implement / QA	Lead Engineer / Code Reviewer
`test-driven-development`	Implement	Lead Engineer
`verify`	QA / Archive	Code Reviewer / Knowledge Extractor
`code-review-checklist`	QA	Code Reviewer
`wal-documentation-rules`	Archive	Knowledge Extractor
`skill-graph-manager`	Any (skills change)	Skill Graph Curator
`java-architecture-standards`	Propose / Implement	System Architect / Lead Engineer
`java-coding-style`	Implement	Lead Engineer
`java-testing-standards`	QA	Code Reviewer
`mybatis-sql-standard`	Propose / Implement	System Architect / Lead Engineer

Role-Required Skills (Explicitly Mounted)

Skill	Required By
`cognitive-bias-checklist`	Requirement Engineer, System Architect, Devil's Advocate
`spec-quality-checklist`	Requirement Engineer, System Architect, Documentation Curator
`decision-frameworks`	System Architect, Devil's Advocate, Ambiguity Gatekeeper
`linter-severity-standard`	Code Reviewer

Engineering Pipeline Skills

ai-pipeline, blueprint, architecture-decision-records, eval-harness, external-research, self-improve, ai-slop-cleaner

Workflow & Collaboration Skills

dispatching-parallel-agents, using-git-worktrees, release, deepinit, remember

🚀 Quick Start

3-Minute Onboarding Guide

Step 1: Read the "Constitution" ⚡

Start with AGENTS.md - the master entry point defining execution discipline and OS mounting rules.

OOM Killer: Wiki ≤ 3 docs, Code ≤ 8 files (Trigger Escalation if exceeded).
Cognitive Brake: Mandatory XML block before any action to enforce Process, Scope, Budget, and bias reflection.

Step 2: Make a System Call (Shortcuts DSL) 🎯

Use explicit commands to force the framework into a specific track:

@read / @learn     → Enter Read-Only Process (TRIVIAL, no side effects)
@patch / @quickfix → Mount PATCH Track (LOW, lightweight fix)
@standard          → Mount STANDARD Track (MEDIUM/HIGH, full heavy lifecycle)

Example:

@learn --scope src/foo/bar.ts -- explain this file
@patch --risk low --test "mvn test" -- fix NPE in createOrder
@standard --risk high -- implement tenant permission checks for order list API

Step 3: Understand Breakpoint Resume 🔄

Launch Spec is persisted at router/runs/launch_spec_*.md (Acts as the PCB - Process Control Block).
First action after session interruption: read this file to restore state.
If stuck at WAITING_APPROVAL, the Agent will wait for you to review openspec.md and say "Approved" before switching to kernel mode to write code.

🛡️ Self-Correction & Gating Mechanisms

Mechanism	Trigger Point	Effect	OS Metaphor
Cognitive_Brake	Before any action	Forces LLM reasoning (roles, boundaries, bias reflection)	Kernel Privilege Check
pre_hook	Before new phase	Load rules + output preflight	Context Switch
guard_hook	During code edit	Blocks style/auth violations immediately; runs `secrets_linter.py`	Memory Segfault Guard
Approval Gate	After Review	Freezes contract, waits for human	User-to-Kernel Switch
shift_left_hook	After code written	Forces autonomous compile check (`javac` / `mvn compile`); max 2 retries	Build Sanity Check
Archive Write-back	Task end	Appends stable specs to Wiki Index (WAL)	fsync (Dirty Page Write)

📖 Related Documentation

📘 Engineering Manual (Chinese): ENGINEERING_MANUAL_zh.md
📘 Engineering Manual (English): ENGINEERING_MANUAL.md
🇨🇳 Chinese README: README_zh.md
📌 Project Rules: AGENTS.md - Master rule entry & constitution
🗺️ Knowledge Graph: .agents/llm_wiki/KNOWLEDGE_GRAPH.md - Virtual FS Root

🤝 Contributing

Welcome to co-build this pure M2M engineering infrastructure!

Read First: Deeply understand the Microkernel philosophy in ENGINEERING_MANUAL.md.
Follow Lifecycle: Architectural changes must go through the STANDARD track.
Restraint: We pursue high density and orthogonality in skills. Refuse adding "spaghetti" single-instruction skills.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ for sustainable, non-bloating backend development

⬆ Back to Top

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
.agents		.agents
.claude		.claude
static		static
.gitignore		.gitignore
AGENTS.md		AGENTS.md
ENGINEERING_MANUAL.md		ENGINEERING_MANUAL.md
ENGINEERING_MANUAL_zh.md		ENGINEERING_MANUAL_zh.md
LICENSE		LICENSE
README.md		README.md
README_zh.md		README_zh.md

Folders and files

Latest commit

History

Repository files navigation