Changelog

All notable changes to GoClaw are documented here. For full documentation, see docs.goclaw.sh.

Unreleased

Breaking Changes

Context pruning now opt-in. Previously tool-result trimming ran by default for all providers; now requires explicit contextPruning.mode: "cache-ttl" in config.agents.defaults to enable. Matches upstream TS design and prevents silent prompt-cache invalidation on Anthropic.

Migration — add to config.json5:
```
agents: {
  defaults: {
    contextPruning: { mode: "cache-ttl" }
  }
}
```

New Features

Pancake private-reply (comment → DM). Enables a one-time DM to commenters after the public reply. Stateless on GoClaw side — no DB dedup table, no in-memory state:
- Config: features.private_reply (bool) + private_reply_message (text).
- Template variables {{commenter_name}} and {{post_title}} with literal-replace semantics (pre-sanitizes {{/}} from var values to prevent var-in-var substitution).
- Empty private_reply_message → English fallback constant.
- Dedup strategy: webhook-level comment_id dedup (already in comment_handler.go) + Facebook's per-comment idempotent private_replies endpoint handle duplicates platform-side. No GoClaw state required.
- No DB migration.

Improvements

Context pruning cleanup. Removed redundant Pass 0 (per-result 30% guard), deduplicated double prune call per iteration, added SanitizeHistory to PruneStage for broken tool_use/tool_result pair cleanup.
Context pruning config backfill (migration). Agents with existing custom context_pruning config (e.g., softTrimRatio, keepLastAssistants) but missing a mode field get auto-backfilled with mode: "cache-ttl" to preserve their intent after the opt-in flip. Rows with NULL config stay NULL (new opt-in default applies). PG migration 51; SQLite schema v19.
Pancake channel metadata routing. Whitelist in internal/channels/routing_metadata.go now preserves post_id and display_name across the inbound → outbound hop so the private-reply template variables survive the agent pipeline round-trip.

Project Status

Implemented & Tested in Production

Agent management & configuration — Create, update, delete agents via API and web dashboard. Agent types (open / predefined), agent routing, and lazy resolution all tested.
Telegram channel — Full integration tested: message handling, streaming responses, rich formatting (HTML, tables, code blocks), reactions, media, chunked long messages.
Seed data & bootstrapping — Auto-onboard, DB seeding, migration pipeline tested end-to-end.
User-scope & content files — Per-user context files (user_context_files), agent-level context files (agent_context_files), virtual FS interceptors, per-user seeding (SeedUserFiles), and user-agent profile tracking all implemented and tested.
Core built-in tools — File system tools (read_file, write_file, edit_file, list_files, search, glob), shell execution (exec), web tools (web_search, web_fetch), and session management tools tested in real agent loops.
Memory system — Long-term memory with pgvector hybrid search (FTS + vector) implemented and tested with real conversations.
Agent loop — Think-act-observe cycle, tool use, session history, auto-summarization, and subagent spawning tested in production.
WebSocket RPC protocol (v3) — Connect handshake, chat streaming, event push all tested with web dashboard and integration tests.
Store layer (PostgreSQL) — All PG stores (sessions, agents, providers, skills, cron, pairing, tracing, memory, teams) implemented and running.
Browser automation — Rod/CDP integration for headless Chrome, tested in production agent workflows.
Lane-based scheduler — Main/subagent/team/cron lane isolation with concurrent execution tested. Group chats support up to 3 concurrent agent runs per session with adaptive throttle and deferred session writes for history isolation.
Security hardening — Rate limiting, prompt injection detection, CORS, shell deny patterns, SSRF protection, credential scrubbing all implemented and verified.
Web dashboard — Channel management, agent management, pairing approval, traces & spans viewer, skills, MCP, cron, sessions, teams, and config pages all implemented and working.
Prompt caching — Anthropic (explicit cache_control), OpenAI/MiniMax/OpenRouter (automatic). Cache metrics tracked in trace spans and displayed in web dashboard.
Agent delegation — Inter-agent task delegation with permission links, sync/async modes, per-user restrictions, concurrency limits, and hybrid agent search. Tested in production.
Agent teams — Team creation with lead/member roles, shared task board (create, claim, complete, search, blocked_by dependencies), team mailbox (send, broadcast, read). Tested in production.
Evaluate loop — Generator-evaluator feedback cycles with configurable max rounds and pass criteria. Tested in production.
Delegation history — Queryable audit trail of inter-agent delegations. Tested in production.
Skill system — BM25 search, ZIP upload, SKILL.md parsing, and embedding hybrid search. Tested in production.
MCP integration — stdio, SSE, and streamable-http transports with per-agent/per-user grants. Tested in production.
Cron scheduling — at, every, and cron expression scheduling. Tested in production.
Docker sandbox — Isolated code execution in containers. Tested in production.
Text-to-Speech — OpenAI, ElevenLabs, Edge, MiniMax providers. Tested in production.
HTTP API — /v1/chat/completions, /v1/agents, /v1/skills, etc. Tested in production. Interactive Swagger UI at /docs.
API key management — Multi-key auth with RBAC scopes, SHA-256 hashed storage, show-once pattern, optional expiry, revocation. HTTP + WebSocket CRUD. Web UI for management.
Hooks system — Event-driven hooks with command evaluators (shell exit code) and agent evaluators (delegate to reviewer). Blocking gates with auto-retry and recursion-safe evaluation.
Media tools — create_image (DashScope, MiniMax), create_audio (OpenAI, ElevenLabs, MiniMax, Suno), create_video (MiniMax, Veo), read_document (Gemini File API), read_image, read_audio, read_video. Persistent media storage with lazy-loaded MediaRef.
Additional provider modes — Claude CLI (Anthropic via stdio + MCP bridge), Codex (OpenAI gpt-5.3-codex via OAuth).
Knowledge graph — LLM-powered entity extraction, graph traversal, force-directed visualization, and knowledge_graph_search agent tool.
Memory management — Admin dashboard for memory documents (CRUD, semantic search, chunk/embedding details, bulk re-indexing).
Persistent pending messages — Channel messages persisted to PostgreSQL with auto-compaction (LLM summarization) and monitoring dashboard.
Heartbeat system — Periodic agent check-ins via HEARTBEAT.md checklists with suppress-on-OK, active hours, retry logic, and channel delivery.

Implemented but Not Fully Tested

Slack — Channel integration implemented, not yet validated with real users.
Other messaging channels — Discord, Zalo OA, Zalo Personal, Feishu/Lark, WhatsApp channel adapters are implemented but have not been tested end-to-end in production. Only Telegram has been validated with real users.
OpenTelemetry export — OTLP gRPC/HTTP exporter implemented (build-tag gated). In-app tracing works; external OTel export not validated in production.
Tailscale integration — tsnet listener implemented (build-tag gated). Not tested in a real deployment.
Redis cache — Optional distributed cache backend (build-tag gated). Not tested in production.
Browser pairing — Pairing code flow implemented with CLI and web UI approval. Basic flow tested but not validated at scale.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changelog

Unreleased

Breaking Changes

New Features

Improvements

Project Status

Implemented & Tested in Production

Implemented but Not Fully Tested

FilesExpand file tree

CHANGELOG.md

Latest commit

History

CHANGELOG.md

File metadata and controls

Changelog

Unreleased

Breaking Changes

New Features

Improvements

Project Status

Implemented & Tested in Production

Implemented but Not Fully Tested