PRD: Squad State — Typed StorageProvider Interface with Contract Conformance #481
Squad State: Typed Interface + MarkdownProvider
Author: Dina Berry
Date: 2026-03-22
Status: Draft — Awaiting Brady's Review
Target Repository: bradygaster/squad (packages/squad-sdk/)
Related: Full research document in PRD-squad-state.md
1. Problem Statement
The Squad SDK has no centralized state management. Every module that reads or writes .squad/ files does it independently — 129 read ops and 87 write ops scattered across 29 modules, all direct fs calls with no shared code, no abstraction, and no type safety.
What's broken:
- 6 separate directory scans for the same `.squad/` tree (`LocalAgentSource`, `CharterCompiler`, `AgentLifecycleManager` (stale path!), `nap.ts`, `doctor.ts`, `export.ts` (wrong path!))
- Parsers are read-only: markdown → typed objects works, but no serializers exist for the reverse
- CLI bypasses SDK for writes: raw string concatenation instead of typed APIs
- Stringly-typed everywhere: agent status duplicated 3×, routing tiers duplicated 4×, model tiers duplicated 3×, and `WorkType` has a `| string` escape hatch that defeats the union
Why now: The team has converged through three research sessions (2026-03-20 through 2026-03-22). The architecture is validated. Time to build.
Why a StorageProvider interface (two independent arguments)
POV #1 — Backend Flexibility: The interface enables swapping storage backends (markdown, SQLite, GitHub API) without changing SDK code. Markdown is the default; others become mechanical implementations against the same contract.
POV #2 — Contract Conformance: The interface defines a testable contract. A provider-agnostic conformance test suite validates that implementations behave correctly — catching bugs in the existing markdown provider (round-trip data loss, edge cases) even if no second backend is ever built. Value on day one.
POV #2 is arguably stronger: it delivers value immediately without requiring faith that a second backend will be needed.
Scope of this PRD
This PRD covers: The StorageProvider typed interface and the MarkdownStorageProvider implementation (plus InMemoryStorageProvider for tests).
This PRD does NOT cover: Implementation details for SQLite, GitHub API, or other non-filesystem providers. The interface is designed so they CAN be added later — that's a separate future PRD.
2. Goals & Success Metrics
Goals
- Define the `StorageProvider` interface: typed, async, collection-level CRUD with section-level operations
- Ship `MarkdownStorageProvider`: default implementation preserving all current `.squad/` behavior
- Ship `InMemoryStorageProvider`: for fast, deterministic, filesystem-free tests
- Build `SquadState` as the typed facade: one entry point, typed collections, Agent Handle pattern, no `as` casts
- Establish conformance test suite: provider-agnostic tests that validate any implementation
Success Metrics
| Metric | Target |
|---|---|
| Direct `fs` calls in SDK | 0 outside `MarkdownStorageProvider` |
| Directory scan locations | 1 (internal to MarkdownStorageProvider) |
| SDK test execution time | <2s for state tests (InMemory) |
| Round-trip fidelity | 100% for all document types |
| CLI write operations | 0 raw string surgery |
Non-Goals
- Build non-filesystem providers (SQLite, GitHub API, etc.)
- Replace git as the transport/sync mechanism
- Abstract the communication layer (issues/PRs)
- Break existing `.squad/` directory structure or formats
3. Key User Scenarios
Scenario 1: Read agent charter
// BEFORE: bespoke fs + parsing per module
const raw = fs.readFileSync(path.join(squadRoot, '.squad', 'agents', name, 'charter.md'), 'utf-8');
const parsed = parseCharter(raw);
// AFTER: typed handle — plain string in, typed API out
const mal = await state.agents.get('mal'); // AgentHandle — validates at runtime, throws NotFoundError
const charter = await mal.charter(); // Promise<Charter> — fully typed, no casts
const learnings = await mal.history('learnings'); // Promise<HistoryEntry[]>
await mal.appendHistory('learnings', entry); // Promise<void>
Scenario 2: CLI appends a decision
// BEFORE: raw string concatenation
const entry = `### ${timestamp}: ${title}\n**By:** ${author}\n\n${body}\n`;
fs.writeFileSync(decisionsPath, existing + '\n' + entry);
// AFTER: typed API handles serialization
await state.decisions.add({ title, by: 'Mal (Lead)', body, timestamp: new Date() });
Scenario 3: Unit testing without filesystem
const store = new InMemoryStorageProvider();
store.seed({
agents: {
'test-agent': {
charter: { name: 'test-agent', role: 'Tester', status: 'active' },
},
},
});
const state = new SquadState(store); // fast, deterministic, no cleanup
const agent = await state.agents.get('test-agent'); // AgentHandle: typed, no casts
4. Scope
In Scope
- `StorageProvider` interface definition (typed async CRUD at document + section level)
- `MarkdownStorageProvider`: wraps current filesystem operations, owns markdown serialization
- `InMemoryStorageProvider`: for tests
- `SquadState` class: typed facade with collection-specific sub-interfaces
- Shared `types.ts`: single source of truth for `AgentName`, `AgentStatus`, `HistorySection`, `ModelTier`, `RoutingTier`, `CollectionName`, domain types
- Markdown serializers for all document types (charters, decisions, routing, team, history)
- Conformance test suite: provider-agnostic, validates any `StorageProvider` implementation
- Absorption of `history-shadow.ts` into `SquadState` as `agents.history` collection
Out of Scope
- Non-filesystem provider implementations (SQLite, GitHub API, JSON — interface supports them, future PRD)
- Communication layer abstraction (issues/PRs — separate concern)
- Config schema unification (`SquadConfig` vs `SquadSDKConfig`, adjacent work)
- Breaking changes to `.squad/` format
5. Approach
Architecture
SquadState (typed facade — typed collections, AgentHandle pattern, domain objects)
└── StorageProvider (interface — typed async CRUD, collection→entity mapping)
├── MarkdownStorageProvider (default — reads/writes .squad/ files)
│ └── SquadFileSystem (file discovery, path resolution, caching)
└── InMemoryStorageProvider (for tests)
Key Design Decisions
1. StorageProvider Interface — Type-Safe Collection-Entity Mapping
The generic parameters on read() and write() are linked to the collection name via CollectionEntityMap. The compiler prevents reading a Charter from the 'decisions' collection — a class of bug that unconstrained generics silently permit.
// Maps each collection to its entity type — the compiler enforces correct pairings
interface CollectionEntityMap {
agents: Charter;
decisions: Decision;
routing: RoutingConfig;
team: TeamConfig;
skills: SkillDefinition;
templates: Template;
log: LogEntry;
config: SquadConfig;
}
type CollectionName = keyof CollectionEntityMap;
type CollectionEntity = CollectionEntityMap[CollectionName];
interface StorageProvider {
// Collection-level
list(collection: CollectionName): Promise<string[]>;
// Whole-document operations — collection name constrains entity type
read<C extends CollectionName>(collection: C, id: string): Promise<CollectionEntityMap[C]>;
write<C extends CollectionName>(collection: C, id: string, entity: CollectionEntityMap[C]): Promise<void>;
exists(collection: CollectionName, id: string): Promise<boolean>;
delete(collection: CollectionName, id: string): Promise<void>;
// Section-level operations (sections are string-typed at this layer;
// SquadState facade adds typed section names per collection)
readSection(collection: CollectionName, id: string, section: string): Promise<unknown>;
appendToSection(collection: CollectionName, id: string, section: string, entry: SectionEntry): Promise<void>;
// Lifecycle
initialize(): Promise<void>;
dispose(): Promise<void>;
}
Design note (River): The old `read<T extends CollectionEntity>()` signature let callers request any entity type from any collection: `read<Charter>('decisions', 'foo')` compiled but was always wrong. The `CollectionEntityMap` pattern (same one the SDK uses for `SquadEventPayloadMap`) makes invalid states unrepresentable.
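To make the compile-time pairing concrete, here is a minimal sketch with a two-collection map. `TinyStore` and the trimmed `Charter` and `Decision` shapes are assumptions for illustration only; they are not the SDK's types or its `InMemoryStorageProvider`.

```typescript
// Trimmed, hypothetical entity shapes; the real SDK types carry more fields.
interface Charter { name: string; role: string }
interface Decision { title: string; by: string }

interface CollectionEntityMap {
  agents: Charter;
  decisions: Decision;
}
type CollectionName = keyof CollectionEntityMap;

// Hypothetical store used only for this sketch.
class TinyStore {
  // Keyed by "<collection>/<id>". write() is the only way in, so the stored
  // value always matches the collection prefix of its key.
  private readonly docs = new Map<string, unknown>();

  async write<C extends CollectionName>(
    collection: C,
    id: string,
    entity: CollectionEntityMap[C],
  ): Promise<void> {
    this.docs.set(`${collection}/${id}`, entity);
  }

  async read<C extends CollectionName>(collection: C, id: string): Promise<CollectionEntityMap[C]> {
    const key = `${collection}/${id}`;
    if (!this.docs.has(key)) throw new Error(`${key} not found`);
    // The single cast lives inside the store, justified by the write() invariant.
    return this.docs.get(key) as CollectionEntityMap[C];
  }
}

async function demo(): Promise<string> {
  const store = new TinyStore();
  await store.write('agents', 'mal', { name: 'mal', role: 'Lead' });
  const charter = await store.read('agents', 'mal'); // inferred as Charter, no annotation
  // Rejected by the compiler, because the object is not a Decision:
  // await store.write('decisions', 'd1', { name: 'mal', role: 'Lead' });
  return charter.role;
}
```

Call sites never name an entity type: the collection argument alone selects it, which is the property the design note relies on.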
2. SquadState — Agent Handle Pattern (No Branded Types, No Casts)
The Agent Handle pattern eliminates branded types entirely. The old AgentName = string & { __brand: 'AgentName' } forced every call site into an as AgentName cast — a type system smell that means the design is wrong, not the caller. Instead, state.agents.get() accepts a plain string, validates at runtime, and returns a typed AgentHandle. Everything downstream is fully typed without casts.
type AgentStatus = 'active' | 'inactive' | 'retired';
type HistorySection = 'context' | 'learnings' | 'decisions' | 'patterns' | 'issues' | 'references';
type ModelTier = 'premium' | 'standard' | 'fast';
type RoutingTier = 'direct' | 'lightweight' | 'standard' | 'full';
// The handle is the typed API surface for a single agent
interface AgentHandle {
readonly name: string;
charter(): Promise<Charter>;
history(): Promise<History>;
history(section: HistorySection): Promise<HistoryEntry[]>;
appendHistory(section: HistorySection, entry: HistoryEntry): Promise<void>;
status(): Promise<AgentStatus>;
}
// The collection manages discovery and handle creation
interface AgentCollection {
list(): Promise<string[]>;
get(name: string): Promise<AgentHandle>; // validates name, throws NotFoundError
exists(name: string): Promise<boolean>;
}
interface DecisionCollection {
list(): Promise<Decision[]>;
add(decision: Decision): Promise<void>;
}
class SquadState {
readonly agents: AgentCollection;
readonly team: TeamCollection;
readonly routing: RoutingCollection;
readonly decisions: DecisionCollection;
readonly skills: SkillCollection;
constructor(provider: StorageProvider) { /* wires typed collections to provider */ }
}
Why no branded AgentName: The SDK's existing `AgentRef` is `string`. The codebase uses plain strings for agent names everywhere. A branded type adds friction (casts at every boundary) without adding safety. The real validation is "does this agent exist?", which is a runtime concern handled by `AgentHandle`. The handle IS the proof of validity.
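One way `get()` could act as the single validation point is sketched below. `CharterSource`, `getAgent`, and `memorySource` are hypothetical names introduced for this example, and the `Charter` and `AgentHandle` shapes are trimmed; the real collection would sit on top of `StorageProvider`.

```typescript
interface Charter { name: string; role: string }

// Minimal stand-in for the slice of StorageProvider this sketch needs (assumption).
interface CharterSource {
  exists(id: string): Promise<boolean>;
  read(id: string): Promise<Charter>;
}

class NotFoundError extends Error {
  readonly kind = 'not-found' as const;
  constructor(readonly id: string) {
    super(`agents/${id} not found`);
  }
}

// Trimmed handle: only the members needed to show the pattern.
interface AgentHandle {
  readonly name: string;
  charter(): Promise<Charter>;
}

// Plain string in, validated handle out. A handle exists only for a known agent.
async function getAgent(source: CharterSource, name: string): Promise<AgentHandle> {
  if (!(await source.exists(name))) throw new NotFoundError(name);
  return {
    name,
    charter: () => source.read(name),
  };
}

// Hypothetical in-memory source for demonstration.
function memorySource(charters: Map<string, Charter>): CharterSource {
  return {
    exists: async (id) => charters.has(id),
    read: async (id) => {
      const charter = charters.get(id);
      if (charter === undefined) throw new NotFoundError(id);
      return charter;
    },
  };
}
```

Because the name check happens once in `getAgent`, code that receives an `AgentHandle` never has to re-validate or cast the agent name.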
3. Serialization = Provider's Responsibility
The interface deals in typed objects only. MarkdownProvider serializes to/from markdown. A future SQLiteProvider would map to/from SQL rows. The caller never sees raw strings or format-specific content.
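As an illustration of what provider-owned serialization might look like for one document type, the sketch below pairs a serializer with a parser and checks that they round-trip. The entry layout follows the CLI snippet in Scenario 2; the trimmed `Decision` shape (with `timestamp` as a whitespace-free string) and the function names are assumptions, not the SDK's existing parsers.

```typescript
// Hypothetical, trimmed Decision shape. The timestamp is assumed to contain no whitespace.
interface Decision {
  timestamp: string;
  title: string;
  by: string;
  body: string;
}

// Typed object -> markdown, using the heading layout from Scenario 2.
function serializeDecision(d: Decision): string {
  return `### ${d.timestamp}: ${d.title}\n**By:** ${d.by}\n\n${d.body}\n`;
}

// Markdown -> typed object. Throws when the text is not a single decision entry.
function parseDecision(markdown: string): Decision {
  const match = /^### (\S+): (.*)\n\*\*By:\*\* (.*)\n\n([\s\S]*)\n$/.exec(markdown);
  if (!match) throw new Error('Not a decision entry');
  const [, timestamp, title, by, body] = match;
  return { timestamp, title, by, body };
}

// Round-trip check in both directions: object -> text -> object, and text -> object -> text.
function roundTrips(d: Decision): boolean {
  const text = serializeDecision(d);
  const back = parseDecision(text);
  return JSON.stringify(back) === JSON.stringify(d) && serializeDecision(back) === text;
}
```

Callers of the provider would only ever see `Decision` values; the markdown string stays inside these two functions.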
4. Granularity = File + Section Level
Operations work at two levels: whole-document (get/put a Charter) and section (append to History.Learnings). This maps to how .squad/ files use ## headers as logical sections. Non-filesystem backends can map sections to table columns, API endpoints, or document subsections.
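A section-level append over markdown text could be sketched as follows. The function name `appendToMarkdownSection` is hypothetical, and the sketch assumes sections are delimited by lines starting with `## ` and that no such lines appear inside code fences.

```typescript
// Appends one entry line to the section headed `## <section>`.
// If the section does not exist, it is created at the end of the document.
function appendToMarkdownSection(markdown: string, section: string, entry: string): string {
  const lines = markdown.split('\n');
  const start = lines.findIndex((line) => line.trim() === `## ${section}`);

  if (start === -1) {
    const base = markdown.trimEnd();
    return `${base}${base ? '\n\n' : ''}## ${section}\n\n${entry}\n`;
  }

  // The section ends at the next `## ` heading, or at end of file.
  let end = lines.findIndex((line, i) => i > start && line.startsWith('## '));
  if (end === -1) end = lines.length;

  // Step back over trailing blank lines so the entry lands right under existing content.
  let insertAt = end;
  while (insertAt > start + 1 && lines[insertAt - 1].trim() === '') insertAt--;

  lines.splice(insertAt, 0, entry);
  return lines.join('\n');
}
```

Content outside the targeted section is left byte-for-byte unchanged, which is what keeps diffs small for human-edited files.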
5. history-shadow.ts Absorbed Into SquadState
The existing history-shadow.ts proved the pattern (typed CRUD, section-level operations, markdown as storage). It does NOT remain standalone — it becomes state.agents.history. One interface, no parallel APIs.
6. Inbox/Drop-Box = Provider-Specific
The decisions inbox pattern (write to inbox files, Scribe merges) is a MarkdownProvider implementation detail, NOT part of the StorageProvider interface. The interface exposes addDecision(decision: Decision): Promise<void>. How the provider handles concurrent writes is its own concern.
7. Async-Only
All StorageProvider methods return Promises. No sync variant. JavaScript is async-first, and the interface must support remote backends.
Conformance Test Suite (Phase 0 Foundation)
The conformance suite is the specification made executable. Written as function runConformanceSuite(provider: StorageProvider), invoked once per implementation.
Core behaviors tested:
- `write()` → `read()` returns identical typed object (round-trip)
- `write()` → `exists()` returns `true`; before `write()` → returns `false`
- `list()` returns all written documents, no extras
- `appendToSection()` preserves existing content, adds new content
- `delete()` → `exists()` returns `false`; `read()` throws
- `write()` twice → `read()` returns second write (overwrite semantics)
- Empty collection → `list()` returns `[]`, not error
- `initialize()` → `dispose()` lifecycle is clean
Why this matters: If both InMemoryStorageProvider and MarkdownStorageProvider pass the same suite, the contract is verified from two independent implementations. Neither can cheat in a way the other also cheats.
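A framework-free sketch of several of these checks is shown below, to illustrate how one function can judge any implementation. `DocProvider`, `InMemoryDocProvider`, and `checkConformance` are hypothetical names; the contract is narrowed to a single collection for brevity, and section and lifecycle checks are omitted.

```typescript
// Hypothetical single-collection contract, narrowed from the PRD's StorageProvider.
interface DocProvider<T> {
  list(): Promise<string[]>;
  read(id: string): Promise<T>;
  write(id: string, entity: T): Promise<void>;
  exists(id: string): Promise<boolean>;
  delete(id: string): Promise<void>;
}

class InMemoryDocProvider<T> implements DocProvider<T> {
  private readonly docs = new Map<string, T>();
  async list(): Promise<string[]> { return Array.from(this.docs.keys()); }
  async read(id: string): Promise<T> {
    const doc = this.docs.get(id);
    if (doc === undefined) throw new Error(`${id} not found`);
    return doc;
  }
  async write(id: string, entity: T): Promise<void> { this.docs.set(id, entity); }
  async exists(id: string): Promise<boolean> { return this.docs.has(id); }
  async delete(id: string): Promise<void> { this.docs.delete(id); }
}

// Returns the labels of failed checks; an empty array means the provider conforms.
async function checkConformance<T>(
  factory: () => Promise<DocProvider<T>>, // factory, so each run starts from fresh state
  sampleA: T,
  sampleB: T,
): Promise<string[]> {
  const failures: string[] = [];
  const check = (ok: boolean, label: string) => { if (!ok) failures.push(label); };
  const same = (x: unknown, y: unknown) => JSON.stringify(x) === JSON.stringify(y);

  const p = await factory();
  check(same(await p.list(), []), 'empty collection lists []');
  check(!(await p.exists('a')), 'exists() is false before write()');

  await p.write('a', sampleA);
  check(await p.exists('a'), 'exists() is true after write()');
  check(same(await p.read('a'), sampleA), 'write() then read() round-trips');

  await p.write('a', sampleB);
  check(same(await p.read('a'), sampleB), 'second write() overwrites');
  check(same(await p.list(), ['a']), 'list() has no extras');

  await p.delete('a');
  check(!(await p.exists('a')), 'exists() is false after delete()');
  const threw = await p.read('a').then(() => false, () => true);
  check(threw, 'read() throws after delete()');
  return failures;
}
```

Running the same function against a second implementation only requires a different factory, for example `checkConformance(() => Promise.resolve(new InMemoryDocProvider<string>()), 'x', 'y')`.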
Competitive Analysis (Brief)
| Pattern | LangGraph | CrewAI | AutoGen | Semantic Kernel | Squad |
|---|---|---|---|---|---|
| Pluggable storage interface | ✅ | ❌ | Partial | ✅ | ✅ |
| Human-readable state | ❌ | ❌ | ❌ | ❌ | ✅ |
| Git-diffable state | ❌ | ❌ | ❌ | ❌ | ✅ |
| Document-level operations | ❌ | ❌ | ❌ | ❌ | ✅ |
| Contract conformance tests | ❌ | ❌ | ❌ | ❌ | ✅ |
| State is the product | ❌ | ❌ | ❌ | ❌ | ✅ |
Squad's unique position: Every other framework stores opaque blobs or embeddings. Squad stores semantically rich, human-editable documents. The closest analog is an ORM, not a checkpoint system.
Implementation Phases
Phase 0: Conformance Suite + Tests (1 week)
- Round-trip fidelity tests for existing parsers
- Conformance test suite (`runConformanceSuite(provider)`) alongside interface
- Run against `InMemoryStorageProvider` as smoke test
- Deliverable: Living spec + round-trip tests (red, because serializers don't exist yet)
Phase 1: Interface + Two Providers (2 weeks)
- `StorageProvider` interface + shared `types.ts` (single source of truth)
- `MarkdownStorageProvider` with `SquadFileSystem` internals
- `InMemoryStorageProvider`
- Both pass conformance suite
- Deliverable: Interface, two providers, all tests green
Phase 2: SquadState Facade (2 weeks)
- Typed facade with collection-specific sub-interfaces
- Wire into existing SDK modules (replace scattered `fs` calls)
- Deliverable: `SquadState` class, SDK modules refactored
Phase 3: CLI Migration (1–2 weeks)
- Replace all CLI raw string surgery with `SquadState` APIs
- Deliverable: CLI fully migrated, zero raw string surgery
6. Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Interface shape is wrong | Medium | High | Let MarkdownProvider drive the shape; two implementations validate. Refine before shipping. |
| Round-trip fidelity is harder than expected | High | Medium | Phase 0 catches it early. Known hard cases: YAML frontmatter, inline HTML, comments. |
| Migration breaks CLI behavior | Medium | High | Conformance suite + incremental migration (one module at a time). |
| Scope creep into non-filesystem providers | Medium | Medium | Hard boundary: this PRD covers interface + MarkdownProvider only. Other providers = separate PRD. |
Dependencies
- Node.js 18+ (already met), existing parsers in `markdown-migration.ts` and `doc-sync.ts`, TypeScript strict mode (enabled), no new npm packages
Edge Cases (Interface-Level)
These are contracts the interface defines; implementation details are provider-specific:
| Edge Case | Interface Contract |
|---|---|
| Concurrent appends | appendToSection() is the provider's responsibility. MarkdownProvider uses atomic file ops; others use transactions. |
| Atomicity | write() is all-or-nothing. A failed write must not leave partial state. |
| Error types | Discriminated error union — see Error Taxonomy below. |
| Change detection | Not in v1 interface. A future subscribe(collection, callback) is a natural extension. |
| Human editability | supportsExternalModification capability flag. Filesystem providers support it; closed providers can cache aggressively. |
| Ordering | Append-only collections carry timestamps. list() returns chronological order. |
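For the atomicity row, one common filesystem technique is to write to a temporary sibling file and then rename it over the target. The sketch below shows that technique under stated assumptions: `writeAtomic` is a hypothetical helper, not the `MarkdownStorageProvider` implementation, and the single-step replacement relies on POSIX `rename` semantics within one directory; behavior on other platforms would need to be verified.

```typescript
import { mkdtemp, readFile, rename, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { dirname, join } from 'node:path';

// Write the full content to a temp file next to the target, then rename it into place.
// Readers see either the old file or the new one, never a half-written file.
async function writeAtomic(targetPath: string, content: string): Promise<void> {
  const suffix = Math.random().toString(36).slice(2);
  const tempPath = join(dirname(targetPath), `.tmp-${suffix}`);
  try {
    await writeFile(tempPath, content, 'utf-8');
    await rename(tempPath, targetPath);
  } catch (err) {
    // Clean up the temp file so a failed write leaves no partial state behind.
    await rm(tempPath, { force: true });
    throw err;
  }
}

// Demonstration in a throwaway directory: the second write fully replaces the first.
async function demo(): Promise<string> {
  const dir = await mkdtemp(join(tmpdir(), 'squad-state-'));
  const file = join(dir, 'decisions.md');
  await writeAtomic(file, 'first\n');
  await writeAtomic(file, 'second\n');
  return readFile(file, 'utf-8');
}
```

The temp file is created in the same directory as the target so that the rename does not cross filesystems.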
Error Taxonomy
Errors use a kind discriminant for exhaustive switch matching, following the SDK's existing discriminated union patterns (see CoordinatorRoutingPayload, AgentMilestonePayload). Each error extends a base StorageError class:
type StorageErrorKind = 'not-found' | 'parse-error' | 'write-conflict' | 'provider-error';
abstract class StorageError extends Error {
abstract readonly kind: StorageErrorKind;
}
class NotFoundError extends StorageError {
readonly kind = 'not-found' as const;
constructor(
readonly collection: CollectionName,
readonly id: string,
) { super(`${collection}/${id} not found`); }
}
class ParseError extends StorageError {
readonly kind = 'parse-error' as const;
constructor(
readonly collection: CollectionName,
readonly id: string,
readonly cause: Error,
) { super(`Failed to parse ${collection}/${id}`); }
}
class WriteConflictError extends StorageError {
readonly kind = 'write-conflict' as const;
constructor(
readonly collection: CollectionName,
readonly id: string,
) { super(`Write conflict on ${collection}/${id}`); }
}
class ProviderError extends StorageError {
readonly kind = 'provider-error' as const;
constructor(
message: string,
readonly cause?: Error,
) { super(message); }
}
// Exhaustive handling — compiler catches missing cases
function handleError(err: StorageError): never {
switch (err.kind) {
case 'not-found': throw new UserFacingError('Resource not found');
case 'parse-error': throw new UserFacingError('Corrupt data');
case 'write-conflict': throw err; // placeholder: retry logic goes here
case 'provider-error': throw err; // placeholder: escalate
}
}
Design note (River): Classes + discriminant is the right hybrid here. Classes give you `instanceof` checks and `.message` / `.stack` for logging. The `kind` discriminant gives you exhaustive `switch` matching. This aligns with the SDK's existing `ErrorFactory` pattern while adding the exhaustiveness guarantee that catch blocks currently lack.
7. Architecture Review — River (TypeScript Architect)
Interface Shape Assessment
CollectionEntityMap is the critical addition. The original read<T extends CollectionEntity>() had an unconstrained generic — the T bore no relationship to the collection parameter. This is the TypeScript equivalent of void*: it compiles, it's wrong, and you won't know until runtime. The mapped type CollectionEntityMap makes the compiler enforce valid collection-entity pairings.
AgentHandle eliminates the branded type cascade. Branded types (string & { __brand: 'AgentName' }) are the right tool when you need to distinguish two string-shaped values at the type level (e.g., UserId vs SessionId in the same function signature). They're the wrong tool when there's only one string type and the real validation is "does this thing exist?" — that's a runtime concern. The Agent Handle pattern moves validation to get() and makes the returned handle the proof of validity. Everything downstream is typed without casts.
The readSection return type is intentionally unknown at the StorageProvider layer. Section typing is the SquadState facade's job — it knows that agents have HistorySection subsections. The provider just stores and retrieves opaque section data. This keeps the provider interface simple and pushes domain knowledge to the right layer.
Generic Patterns
Well-used: `CollectionEntityMap[C]` is an indexed access type keyed by the generic collection parameter, so the entity type follows from the collection name. Same pattern as the SDK's `SquadEventPayloadMap`.
Recommendation — overloaded history() on AgentHandle: The history() method has two signatures (full history vs. single section). TypeScript function overloads handle this cleanly:
interface AgentHandle {
history(): Promise<History>;
history(section: HistorySection): Promise<HistoryEntry[]>;
}
This is better than two separate methods (`getHistory` / `getHistorySection`) because the mental model is "history, optionally scoped."
Where to Use Discriminated Unions
The PRD already uses literal union types (AgentStatus, HistorySection, ModelTier, RoutingTier). These are correct — they're simple enumerations, not variants with different shapes.
Where discriminated unions add value:
- Error taxonomy: done above with `StorageErrorKind`.
- Provider capabilities: if providers need to declare what they support:
type ProviderCapability =
  | { readonly kind: 'external-modification'; supported: true }
  | { readonly kind: 'transactions'; supported: boolean; maxBatchSize?: number }
  | { readonly kind: 'watch'; supported: boolean };
- Future: operation results. If `write()` needs to return more than `void` (e.g., created vs. updated), a discriminated result type is cleaner than boolean flags.
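To show why a discriminated union fits capabilities, the sketch below narrows on `kind` so that variant-specific fields such as `maxBatchSize` are only reachable where they exist. `describeCapability` and its messages are hypothetical; only the `ProviderCapability` type comes from the list above.

```typescript
type ProviderCapability =
  | { readonly kind: 'external-modification'; supported: true }
  | { readonly kind: 'transactions'; supported: boolean; maxBatchSize?: number }
  | { readonly kind: 'watch'; supported: boolean };

// Hypothetical consumer: each case sees only the fields of its own variant.
function describeCapability(cap: ProviderCapability): string {
  switch (cap.kind) {
    case 'external-modification':
      return 'files may change outside the SDK';
    case 'transactions':
      return cap.supported
        ? `transactions (batch limit: ${cap.maxBatchSize ?? 'none'})`
        : 'no transactions';
    case 'watch':
      return cap.supported ? 'change notifications available' : 'no change notifications';
    default: {
      // Adding a new variant without a case makes this assignment fail to compile.
      const unreachable: never = cap;
      throw new Error(`Unhandled capability: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Accessing `cap.maxBatchSize` in the `'watch'` case would be a compile error, which boolean flags on a single flat type cannot express.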
Conformance Testing — Type-Level Enforcement
The conformance suite can use TypeScript's type system to guarantee completeness:
// Compile-time: verify a class implements StorageProvider
type AssertProvider<T extends StorageProvider> = T;
type _checkMarkdown = AssertProvider<MarkdownStorageProvider>; // compile error if missing methods
type _checkInMemory = AssertProvider<InMemoryStorageProvider>;
// Runtime: the conformance suite is a function, not a class
function runConformanceSuite(
name: string,
factory: () => Promise<StorageProvider>, // factory, not instance — fresh state per test
): void {
describe(`StorageProvider conformance: ${name}`, () => {
// Round-trip tests for every collection in CollectionEntityMap
for (const collection of COLLECTION_NAMES) {
it(`round-trips ${collection}`, async () => {
const provider = await factory();
// ... write, read, assert deep equality
});
}
});
}
// Invocation:
runConformanceSuite('MarkdownStorageProvider', () => MarkdownStorageProvider.create(tempDir));
runConformanceSuite('InMemoryStorageProvider', () => Promise.resolve(new InMemoryStorageProvider()));
Key insight: Use a factory function, not an instance. Each test gets a fresh provider. This prevents test pollution and mirrors real initialization.
DX Critique
What I'd enjoy:
- `state.agents.get('mal')` → handle pattern is excellent DX. Autocomplete works. No ceremony.
- `await mal.history('learnings')`: overloaded method, scoped reads. Clean.
- `state.decisions.add(...)`: simple, obvious.
What I'd change:
- `appendToSection(collection, id, section, entry)` at the StorageProvider level takes 4 positional args, three of them string-typed. Consider a params object for clarity when more than 3 args:
appendToSection(params: { collection: CollectionName; id: string; section: string; entry: SectionEntry }): Promise<void>;
- The `initialize()` / `dispose()` lifecycle should be hidden from callers. A constructor cannot await, so an async static factory such as `SquadState.create()` should call `initialize()`. Expose `dispose()` only for cleanup. Consider `Symbol.asyncDispose` for `using` syntax:
await using state = await SquadState.create(provider); // automatically disposed when scope exits
Missing Type Patterns
- Template literal types for paths: If the provider ever needs to express file paths:
type AgentPath = `.squad/agents/${string}/charter.md`;
type HistoryPath = `.squad/agents/${string}/history.md`;
Not critical for v1, but useful if paths leak into the public API.
- `satisfies` over `as const` for seed data: The test fixtures should use `satisfies` to validate structure while preserving literal types:
const seed = {
  agents: { 'test-agent': { charter: { name: 'test-agent', role: 'Tester' } } },
} satisfies SeedData;
- `NoInfer<T>` for provider methods: If a provider method accepts both a value and a type hint, `NoInfer` prevents the value from widening the inferred type. Not needed in v1 but worth noting for future generic methods.