# PRD: Versioned File Layout Manifest — Single Source of Truth for Directory Structure

**Status:** Approved (Flight)  
**Requested by:** Dina Berry  
**Last Updated:** 2026-03-28

---

## Executive Summary

Squad's CLI has five independent code paths that hardcode file locations with no centralized source of truth. This inconsistency has already caused two confirmed bugs (skills discovery #77, squad.agent.md silent deletion #730) and will continue degrading reliability. We propose a versioned JSON manifest that defines every directory Squad manages, its canonical location per version, read priority order, write target, and deprecated paths. All five code paths will reference this manifest instead of hardcoding paths independently.

---

## Problem Statement

### Root Cause: Five Independent Code Paths Without Single Source of Truth

Squad manages 33+ template files and dozens of directories, but has no centralized registry for file layout across versions. Instead, five separate code paths each make independent assumptions about where files live:

1. **`init.ts`** — Creates directories at init time; hardcodes paths like `.squad/`, `.copilot/agents/`, `.github/workflows/`
2. **`migrations.ts`** — Moves files between locations on upgrade; implements custom logic for each version transition
3. **`upgrade.ts`** — Syncs template files to target directories; references `TEMPLATE_MANIFEST` but lacks runtime read-order logic
4. **`squad.agent.md`** — Tells agents where to read/write skills, decisions, and history; contains hardcoded paths that drift from runtime behavior
5. **`skill-source.ts`, `resolver.ts`** — Runtime reads from hardcoded paths using either/or logic (checks one location, stops on first match)

### Evidence: Two Independent Bugs with Same Root Cause

#### Bug 1: Skills Discovery (#77)
- **Symptom:** When users upgrade Squad, the migration moves skills from `.squad/skills/` to `.copilot/skills/`, but runtime still only checks one location
- **Impact:** Skills become unavailable after upgrade; users see "skill not found" errors
- **Root cause:** `skill-source.ts` and `resolver.ts` use either/or pattern (returns first match only), while migration didn't update both paths consistently
- **Related work:** PR #669 proposed a fix (merge both directories), but wasn't merged upstream
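
The gap between the either/or read and the merged read proposed in PR #669 can be sketched as follows. This is an illustration under stated assumptions: the `Skill` shape, the `ListSkills` callback, and both function names are hypothetical stand-ins, not the actual API of `skill-source.ts` or `resolver.ts`.

```typescript
// Hypothetical shapes for illustration; not Squad's real API.
type Skill = { name: string; dir: string };
type ListSkills = (dir: string) => string[]; // names of the skills found in dir

// Either/or read: stops at the first directory that contains anything.
function firstMatchSkills(dirs: string[], list: ListSkills): Skill[] {
  for (const dir of dirs) {
    const names = list(dir);
    if (names.length > 0) return names.map((name) => ({ name, dir }));
  }
  return [];
}

// Merged read: consults every directory; earlier directories win on name conflicts.
function mergedSkills(dirs: string[], list: ListSkills): Skill[] {
  const merged: Skill[] = [];
  for (const dir of dirs) {
    for (const name of list(dir)) {
      const alreadySeen = merged.some((s) => s.name === name);
      if (!alreadySeen) merged.push({ name, dir });
    }
  }
  return merged;
}

// In-memory stand-in for a repo where only some skills were moved.
const fakeRepo: Record<string, string[]> = {
  ".copilot/skills/": ["testing"],
  ".squad/skills/": ["testing", "release-notes"],
};
const listFake: ListSkills = (dir) => fakeRepo[dir] ?? [];
const skillDirs = [".copilot/skills/", ".squad/skills/"];

// Prints "testing": the skill that only exists in .squad/skills/ is never found.
console.log(firstMatchSkills(skillDirs, listFake).map((s) => s.name).join(", "));
// Prints ".copilot/skills/testing, .squad/skills/release-notes".
console.log(mergedSkills(skillDirs, listFake).map((s) => s.dir + s.name).join(", "));
```

With the first-match read, any skill left behind in the older directory becomes invisible as soon as the newer directory contains a single entry, which matches the "skill not found" symptom described above.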

#### Bug 2: squad.agent.md Silent Deletion (#730)
- **Symptom:** After `squad upgrade`, the file `.github/agents/squad.agent.md` silently disappears, breaking Copilot's ability to discover Squad as an agent
- **Three failing code paths** (PR #731):
  - **`upgrade.ts:496-504`** — Template source existence check with no else clause; silent skip if missing
  - **`init.ts:257-261`** — SDK init stamps version but silently skips if file wasn't created
  - **`doctor.ts:~392`** — Detects empty file but reports `warn` instead of `fail`
- **Impact:** Squad stops working on any machine that pulls the repo; no error, no warning, no diagnostic trail
- **Root cause:** Identical to #77 — hardcoded paths with no validation, silent skips instead of errors

### Systemic Pattern: Silent Degradation

These bugs share a common pattern:
- Write operation is guarded by a source-existence check **with no else clause**
- Health check downgrades severity below actual impact
- No post-operation validation confirms the expected outcome

The `TEMPLATE_MANIFEST` defines 33 template files, ~23 marked `overwriteOnUpgrade: true`. **Any of these could silently disappear through the same pattern.**

### Customer Impact

- **Every repo created with `squad init` is affected** by this gap
- Skills discovery failures happen when migrations and runtime path resolution drift
- On upgrade, either/or logic in resolver.ts can miss files entirely
- Version-to-version maintenance is error-prone and requires manual audits of 5 independent code paths
- Silent degradation means bugs go undetected until users report broken functionality

---

## Proposed Solution: Versioned File Layout Manifest

### Core Concept

A single versioned JSON manifest that defines:
- **Every directory Squad manages** and its purpose
- **Canonical location per version** (e.g., v0.8: `.squad/skills/`, v0.9+: `.copilot/skills/`)
- **Read priority order** (which locations to check, in what order)
- **Write target** (where new content goes)
- **Deprecated paths** (old locations still valid for reading, but no longer written to)
- **Migration path** (what moves where on upgrade)
- **Criticality tier** (which files must exist for Squad to function)

All five code paths will reference this manifest instead of hardcoding paths independently.
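
A minimal sketch of what a manifest lookup could look like from a caller's point of view. The `resolvePathFromManifest(name: string): string[]` signature comes from the Phase 1 deliverables; `resolveWriteTarget`, the `LayoutEntry` interface, and the inline `layout` object are assumptions made for illustration (the real manifest would be loaded from JSON).

```typescript
// Field names mirror the proposed manifest; the inline data is a trimmed illustration.
interface LayoutEntry {
  canonical: string;
  readOrder: string[];
  writeTarget: string;
  deprecated: string[];
}

const layout: Record<string, LayoutEntry> = {
  skills: {
    canonical: ".copilot/skills/",
    readOrder: [".copilot/skills/", ".squad/skills/", ".ai-team/skills/"],
    writeTarget: ".copilot/skills/",
    deprecated: [".squad/skills/", ".ai-team/skills/"],
  },
};

// Looks up an entry and fails loudly on unknown names instead of guessing a path.
function getEntry(name: string): LayoutEntry {
  const entry = layout[name];
  if (!entry) throw new Error(`Unknown manifest entry: ${name}`);
  return entry;
}

// Every location to consult when reading, in priority order.
function resolvePathFromManifest(name: string): string[] {
  return getEntry(name).readOrder.slice();
}

// Hypothetical helper: the single location new content is written to.
function resolveWriteTarget(name: string): string {
  return getEntry(name).writeTarget;
}

console.log(resolvePathFromManifest("skills").join(" > "));
console.log(resolveWriteTarget("skills"));
```

Returning the full read order, rather than a single path, is what lets callers merge results across locations instead of stopping at the first match.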

### Benefits

- **Version-to-version changes tracked in ONE place** — manifest is the single source of truth
- **Migrations auto-derived from manifest diffs** — no manual step generation
- **Runtime read logic generated from manifest** — no more either/or bugs
- **Agent instructions reference manifest** — correct write paths guaranteed
- **`squad doctor` validates layout** — catch drift before it becomes bugs
- **Backward compatibility baked in** — deprecated paths + readOrder enable smooth upgrades
- **New code can't introduce silent-skip bugs** — CI gates catch the pattern

---

## Manifest Schema

### Version Inventory

Based on historical analysis, Squad's directory structure has evolved through these versions:

| Version Range | Layout Changes | Affected Files |
|---------------|----------------|----------------|
| ≤ v0.7 | `.ai-team/` era | agents, decisions, skills, history |
| v0.8–v0.8.x | Migrated to `.squad/` | All user-owned content moved |
| v0.9+ | Migrated to `.copilot/` | Skills, decisions, agents moved; `.squad/` deprecated but supported for reading |

### Schema Definition

```jsonc
{
  "schemaVersion": "1.0.0",
  "comment": "Defines all directories Squad manages, indexed by purpose",
  
  "layout": {
    // Each entry: directory purpose → location info + migration path
    
    "skills": {
      "purpose": "Reusable patterns and process knowledge",
      "canonical": ".copilot/skills/",
      "readOrder": [".copilot/skills/", ".squad/skills/", ".ai-team/skills/"],
      "writeTarget": ".copilot/skills/",
      "deprecated": [".squad/skills/", ".ai-team/skills/"],
      "since": "0.9.0",
      "tier": "important",
      "validate": "dir-exists && non-empty"
    },
    
    "decisions": {
      "purpose": "Recorded team decisions and ADRs",
      "canonical": ".copilot/decisions/",
      "readOrder": [".copilot/decisions/", ".squad/decisions/", ".ai-team/decisions/"],
      "writeTarget": ".copilot/decisions/",
      "deprecated": [".squad/decisions/", ".ai-team/decisions/"],
      "since": "0.9.0",
      "tier": "important",
      "validate": "dir-exists"
    },
    
    "agents": {
      "purpose": "Agent configurations and team roster",
      "canonical": ".copilot/agents/",
      "readOrder": [".copilot/agents/", ".squad/agents/", ".ai-team/agents/"],
      "writeTarget": ".copilot/agents/",
      "deprecated": [".squad/agents/", ".ai-team/agents/"],
      "since": "0.9.0",
      "tier": "critical",
      "validate": "dir-exists"
    },
    
    "config": {
      "purpose": "Squad configuration and team metadata",
      "canonical": ".squad/",
      "readOrder": [".squad/"],
      "writeTarget": ".squad/",
      "deprecated": [],
      "since": "0.8.0",
      "tier": "critical",
      "validate": "dir-exists"
    },
    
    "agent-md": {
      "purpose": "GitHub Copilot agent discovery file (critical for Copilot integration)",
      "canonical": ".github/agents/squad.agent.md",
      "readOrder": [".github/agents/squad.agent.md"],
      "writeTarget": ".github/agents/squad.agent.md",
      "deprecated": [],
      "since": "0.8.0",
      "tier": "critical",
      "validate": "file-exists && non-empty && contains-markers",
      "markers": ["Squad", "agent", "system"]
    },
    
    "ci-config": {
      "purpose": "Squad CI workflow configuration",
      "canonical": ".github/workflows/squad-ci.yml",
      "readOrder": [".github/workflows/squad-ci.yml"],
      "writeTarget": ".github/workflows/squad-ci.yml",
      "deprecated": [],
      "since": "0.8.0",
      "tier": "important",
      "validate": "file-exists && non-empty"
    }
  },
  
  "migrations": {
    "0.7-to-0.8": [
      { "from": ".ai-team/skills", "to": ".squad/skills", "action": "move" },
      { "from": ".ai-team/decisions", "to": ".squad/decisions", "action": "move" }
    ],
    "0.8-to-0.9": [
      { "from": ".squad/skills", "to": ".copilot/skills", "action": "move" },
      { "from": ".squad/decisions", "to": ".copilot/decisions", "action": "move" },
      { "from": ".squad/agents", "to": ".copilot/agents", "action": "move" }
    ]
  }
}
```
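
The example entries above appear to follow a few invariants: the canonical path comes first in `readOrder`, `writeTarget` equals `canonical`, and every deprecated path is still listed in `readOrder`. The proposal does not state these rules explicitly, so the check below is a sketch under that assumption; `checkEntry` is a hypothetical name.

```typescript
// Subset of the manifest entry fields needed for the consistency check.
interface LayoutEntry {
  canonical: string;
  readOrder: string[];
  writeTarget: string;
  deprecated: string[];
}

// Returns human-readable problems; an empty list means the entry is consistent.
// The invariants are inferred from the example manifest, not mandated by it.
function checkEntry(name: string, e: LayoutEntry): string[] {
  const problems: string[] = [];
  if (e.readOrder[0] !== e.canonical) {
    problems.push(`${name}: canonical path is not first in readOrder`);
  }
  if (e.writeTarget !== e.canonical) {
    problems.push(`${name}: writeTarget differs from canonical`);
  }
  if (e.deprecated.indexOf(e.writeTarget) !== -1) {
    problems.push(`${name}: writeTarget is marked deprecated`);
  }
  for (const d of e.deprecated) {
    if (e.readOrder.indexOf(d) === -1) {
      problems.push(`${name}: deprecated path ${d} is missing from readOrder`);
    }
  }
  return problems;
}

const skillsEntry: LayoutEntry = {
  canonical: ".copilot/skills/",
  readOrder: [".copilot/skills/", ".squad/skills/", ".ai-team/skills/"],
  writeTarget: ".copilot/skills/",
  deprecated: [".squad/skills/", ".ai-team/skills/"],
};

console.log(checkEntry("skills", skillsEntry).length); // 0 problems for the example entry
```

A check like this could run in CI whenever the manifest changes, so that a new entry cannot, for instance, deprecate a path without keeping it readable.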

### Tier Definitions

| Tier | Definition | Validation | On Missing | Examples |
|------|-----------|------------|-----------|----------|
| **critical** | Product non-functional without it | `fail` in doctor | Block operation | `squad.agent.md`, `.copilot/agents/` |
| **important** | Feature degraded without it | `warn` in doctor | Continue with warning | `casting-registry.json`, `squad-ci.yml`, skill templates, decisions |
| **scaffolding** | Convenience, recreatable | `info` in doctor | Note in log only | `charter.md` template, `history.md` template |

---

## Architecture Principles

### Fail-Loud Policy

Operations on critical files must **NEVER silently skip**. Every code path that writes, copies, or modifies a critical file must have an explicit else clause:

```typescript
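// ❌ Current pattern: silent degradation
if (storage.existsSync(source)) {
  storage.copySync(source, dest);
}

// ✅ Required pattern: fail-loud
if (storage.existsSync(source)) {
  storage.copySync(source, dest);
} else if (entry.tier === 'critical') {
  // Critical tier: a missing template source is an error, not a warning
  throw new Error(`Template source missing for critical file: ${dest}`);
} else {
  // Important and scaffolding tiers: continue, but leave a diagnostic trail
  warn(`Template source missing: ${dest}`);
  warnings.push({ file: dest, reason: 'template-source-missing' });
}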
```

For `critical` tier files, missing source should be an error. For `important` and `scaffolding` tiers, a warning suffices.

### Empty = Missing

Existence checks must also verify that critical files are non-empty. An empty `squad.agent.md` is functionally identical to a missing one (Copilot can't discover the agent), so it must be treated as missing rather than silently accepted.

```typescript
// ❌ Insufficient — passes for empty files
expect(storage.existsSync(agentPath)).toBe(true);

// ✅ Required — catches empty files
expect(storage.existsSync(agentPath)).toBe(true);
const content = storage.readSync(agentPath);
expect(content.trim().length).toBeGreaterThan(0);
```

### Post-Operation Validation

After any operation that modifies the repo structure (`init`, `upgrade`, `migrate`), and on every `squad doctor` run, validate all critical files:

```typescript
function validateCriticalFiles(projectRoot: string): ValidationResult {
  const results: FileValidation[] = [];
  for (const [name, entry] of Object.entries(manifest.layout)) {
    // Only critical-tier entries are validated here
    if (entry.tier !== 'critical') continue;

    const fullPath = path.join(projectRoot, entry.canonical);
    const exists = storage.existsSync(fullPath);
    // Directory entries end with "/" in the manifest; only files have content to read
    const isDir = entry.canonical.endsWith('/');
    const content = exists && !isDir ? storage.readSync(fullPath) : null;
    const nonEmpty = isDir || (content !== null && content.trim().length > 0);
    // Entries without markers pass; entries with markers need readable content
    const markersPresent = entry.markers?.every(m => content?.includes(m)) ?? true;

    results.push({
      name,
      path: entry.canonical,
      tier: entry.tier,
      valid: exists && nonEmpty && markersPresent,
    });
  }
  return { results, allValid: results.every(r => r.valid) };
}
```

This runs as the final step in `init()`, `upgrade()`, and `migrate()` — after all file operations complete.

### Recovery Cascade

When post-operation validation fails for a critical file, attempt recovery before erroring:

1. **Try restore from template** — re-copy from template source
2. **Try restore from git** — `git show HEAD:<path>` to recover from last commit
3. **Error with clear message** — if both fail, surface a specific, actionable error

```typescript
import { execFileSync } from 'node:child_process';

async function recoverCriticalFile(entry: LayoutEntry, projectRoot: string): Promise<boolean> {
  const dest = path.join(projectRoot, entry.canonical);

  // Attempt 1: Restore from template
  // Note: `templateSource` is not yet a field in the schema shown above
  const templatePath = resolveTemplatePath(entry.templateSource);
  if (templatePath && storage.existsSync(templatePath)) {
    storage.copySync(templatePath, dest);
    warn(`Recovered ${entry.canonical} from template`);
    return true;
  }

  // Attempt 2: Restore from git (assumes projectRoot is the repository root,
  // because `HEAD:<path>` is resolved relative to the top of the work tree)
  try {
    // execFileSync passes the path as one argument, so it is never shell-interpreted
    const content = execFileSync('git', ['show', `HEAD:${entry.canonical}`], {
      cwd: projectRoot,
      encoding: 'utf8',
    });
    if (content.trim().length > 0) {
      storage.writeSync(dest, content);
      warn(`Recovered ${entry.canonical} from git history`);
      return true;
    }
  } catch { /* file not in git history */ }

  // Attempt 3: Error with actionable message
  throw new Error(`Critical file missing and unrecoverable: ${entry.canonical}. Reinstall Squad or manually restore this file.`);
}
```

### Doctor Severity = Actual Impact

`squad doctor` severity must match actual user impact:

| Condition | Correct Severity | Rationale |
|-----------|-----------------|-----------|
| File missing, product broken | `fail` | User can't use Squad |
| File empty, product broken | `fail` | Functionally identical to missing |
| File exists but malformed | `warn` | May partially work |
| File missing, product works | `warn` | Degraded but functional |
| File missing, convenience only | `info` | No impact on core functionality |

**Bug fix:** Empty `squad.agent.md` must report `fail`, not `warn`.
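
One way to keep doctor severity aligned with impact is to derive it from the manifest tier and the observed file state, instead of choosing a severity at each call site. The sketch below assumes that "product broken", "product works", and "convenience only" in the table correspond to the `critical`, `important`, and `scaffolding` tiers; the `FileState` type and the `doctorSeverity` name are hypothetical.

```typescript
type Severity = "fail" | "warn" | "info";
type Tier = "critical" | "important" | "scaffolding";
type FileState = "ok" | "missing" | "empty" | "malformed";

// Returns null when there is nothing to report.
function doctorSeverity(tier: Tier, state: FileState): Severity | null {
  if (state === "ok") return null;
  // A malformed file may still partially work, so it is a warning for every tier.
  if (state === "malformed") return "warn";
  // From here on the file is missing or empty; both are treated identically.
  switch (tier) {
    case "critical":
      return "fail";
    case "important":
      return "warn";
    case "scaffolding":
      return "info";
  }
}

// The #730 case: an empty critical file must be a failure, not a warning.
console.log(doctorSeverity("critical", "empty")); // fail
```

Because "empty" and "missing" share one branch, the severity for an empty critical file cannot drift from the severity for a missing one.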

---

## Implementation Strategy: Strangler Fig Pattern

We will not freeze the product. Instead, we will wrap the old system incrementally:

### Step 1: Manifest Describes Current Reality
- Write manifest JSON that documents the current layout
- No behavior changes; manifest is purely descriptive
- Ship and gather feedback
- Risks: none — read-only

### Step 2: Path Resolver Reads from Manifest
- New `resolvePathFromManifest()` function references the manifest
- Old paths still work; resolver logs warnings on drift
- New code (future features) uses manifest resolver from day 1
- Risks: low — old paths unchanged

### Step 3: New Code Uses Manifest
- When touching existing files for unrelated work, swap hardcoded paths to manifest lookups
- No refactor of entire codebase at once
- Incremental migration reduces regression risk

### Step 4: Migrate Remaining Paths
- Migrate remaining hardcoded paths when natural opportunities arise (other PRs, bug fixes)
- All five code paths eventually reference the manifest
- Strangler fig: old paths can be removed once all callers migrated

---

## Implementation Phases

### Phase 1: Foundation — Manifest Schema + Path Resolver + Doctor Integration
**Objective:** Establish the foundation for the Squad manifest-driven architecture.

**Deliverables:**
- [ ] Manifest JSON schema defined at `lib/manifest.json` with 6+ directory entries
- [ ] Path resolver function `resolvePathFromManifest(name: string): string[]` implemented in `lib/path-resolver.ts`
- [ ] `doctor.ts` refactored to use `CriticalFileRegistry` from manifest; critical files report `fail` on missing/empty
- [ ] Manifest validation run as final step in `init()`, `upgrade()`, and `doctor`
- [ ] 20+ new unit tests covering manifest resolution and doctor severity alignment
- [ ] Documentation: how to add new managed directories to the manifest

**Files Modified:**
- `lib/manifest.json` (new)
- `lib/path-resolver.ts` (new)
- `packages/squad-cli/src/cli/commands/doctor.ts` (update severity logic)
- `packages/squad-cli/src/cli/core/init.ts` (add validation step)
- `packages/squad-cli/src/cli/core/upgrade.ts` (add validation step)
- `test/` (add 20+ tests)

**Risks & Mitigations:**
- Risk: Manifest schema missing entries → Mitigation: Enumerate all entries from existing code paths first
- Risk: Doctor false positives → Mitigation: Test against real repos (forks, samples)
- Risk: Performance regression (extra validation step) → Mitigation: Cache manifest during operation

**Definition of Done:**
- All tests pass
- No silent-skip patterns remain in doctor
- Manifest accurately describes all currently managed files
- New feature work can immediately adopt manifest resolver

---

### Phase 2: Code Migration — Audit Skill + Runtime Integration + CI Gate
**Objective:** Migrate the five code paths to use the manifest; add CI gates preventing regression.

**Deliverables:**
- [ ] `skill-source.ts` refactored to resolve paths from manifest (reads all directories in readOrder, merges results)
- [ ] `resolver.ts` refactored to use manifest (returns all matches, not just first)
- [ ] `upgrade.ts` migration logic refactored to derive migrations from manifest diffs
- [ ] `migrations.ts` updated to use manifest resolver for backward-compatible path lookups
- [ ] `squad.agent.md` updated with manifest-derived paths (calls to resolve functions)
- [ ] PR #669 skills merge fix integrated (merge both directories, `.copilot/` wins on conflicts)
- [ ] PR #731 failing tests fixed (3 code paths in upgrade/init/doctor now fail-loud)
- [ ] New CI job: `critical-file-check` — validates all critical files exist and are non-empty after init
- [ ] Test matrix: 11 scenarios covering all version transitions + edge cases
- [ ] 40+ new tests for migration logic and path resolution

**Files Modified:**
- `packages/squad-cli/src/runtime/skill-source.ts` (migrate to manifest)
- `packages/squad-cli/src/runtime/resolver.ts` (migrate to manifest)
- `packages/squad-cli/src/cli/core/upgrade.ts` (migration derivation)
- `packages/squad-cli/src/cli/core/migrations.ts` (use manifest resolver)
- `templates/squad.agent.md.template` (reference manifest)
- `.github/workflows/squad-ci.yml` (add critical-file-check job)
- `test/` (add 40+ tests)

**Related Work:**
- Closes #77 (skills discovery bug) — both directories now consulted
- Closes #730 (squad.agent.md silent deletion) — fail-loud on missing template
- Closes #731 (failing tests) — all three code paths now have else clauses

**Test Matrix: 11 Core Scenarios**
1. Init on v0.7 repo (legacy `.ai-team/` layout) → reads from deprecated paths
2. Init on v0.8 repo (`.squad/` layout) → reads from deprecated paths, writes to canonical
3. Init on v0.9 repo (`.copilot/` layout) → reads and writes to canonical
4. Upgrade v0.7 → v0.8 → verify migration moves files
5. Upgrade v0.8 → v0.9 → verify migration moves files + canonical paths used
6. Upgrade v0.9 → v0.9 (same version) → agent file preserved
7. Skills resolution: both `.squad/` and `.copilot/` present → reads both, `.copilot/` wins on conflicts
8. Skills resolution: only `.squad/` present → reads `.squad/`
9. Skills resolution: only `.copilot/` present → reads `.copilot/`
10. Upgrade with missing template source for critical file → fail-loud, not silent skip
11. Doctor: empty agent file → reports `fail`, not `warn`

**Risks & Mitigations:**
- Risk: Backward compatibility breaks for `.squad/` users → Mitigation: readOrder includes deprecated paths
- Risk: Merge conflicts in skills when both directories present → Mitigation: `.copilot/` wins deterministically
- Risk: Migration corrupts files → Mitigation: pre-migration backup, rollback on validation failure
- Risk: Large-scale refactor introduces regressions → Mitigation: phased rollout, feature flag

**Definition of Done:**
- All 11 scenarios pass
- PR #669 (skills merge fix) or equivalent integrated
- PR #731 (failing tests) all green
- CI job prevents future regressions
- No breaking changes for existing `.squad/` or `.ai-team/` repos

---

### Phase 3: Automation — Migration Generator + Doctor Repair
**Objective:** Automate future layout changes and enable self-service CLI troubleshooting.

**Deliverables:**
- [ ] Migration code generator: `generateMigrationFromManifest(fromVersion, toVersion)` returns migration steps
- [ ] Manifest backfill tool: for users on v0.7/v0.8, automatically generate manifest from observed file layout
- [ ] `squad doctor --repair` command: auto-fixes common issues (missing critical files, deprecated path usage, empty files)
- [ ] Manifest versioning: support multiple manifest versions in git history for diff/review
- [ ] Migration testing CLI: `squad test-migration --from-version X --to-version Y` simulates upgrade without modifying repo
- [ ] 20+ tests covering migration generation, backfill, and repair logic
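
Deriving migration steps from manifest diffs could work roughly as sketched below. The proposal does not specify how per-version layouts are stored, so this assumes two layout snapshots keyed by purpose; `LayoutSnapshot` and `diffLayouts` are hypothetical names, and the step shape follows the `migrations` section of the schema.

```typescript
// Step shape taken from the "migrations" section of the manifest.
interface MigrationStep {
  from: string;
  to: string;
  action: "move";
}

// Assumption: one snapshot per version, keyed by purpose (skills, decisions, ...).
type LayoutSnapshot = Record<string, { canonical: string }>;

// Emits a "move" step for every entry whose canonical location changed.
function diffLayouts(prev: LayoutSnapshot, next: LayoutSnapshot): MigrationStep[] {
  const steps: MigrationStep[] = [];
  for (const name of Object.keys(next)) {
    const before = prev[name];
    const after = next[name];
    // New entries have nothing to move; unchanged entries need no step.
    if (!before || before.canonical === after.canonical) continue;
    steps.push({ from: before.canonical, to: after.canonical, action: "move" });
  }
  return steps;
}

const v08: LayoutSnapshot = {
  skills: { canonical: ".squad/skills/" },
  config: { canonical: ".squad/" },
};
const v09: LayoutSnapshot = {
  skills: { canonical: ".copilot/skills/" },
  config: { canonical: ".squad/" },
};

// One step: .squad/skills/ moves to .copilot/skills/; config is unchanged.
console.log(JSON.stringify(diffLayouts(v08, v09)));
```

A real generator would also need to decide what to do when the destination already exists, which is the conflict case covered by the skills merge rule in Phase 2.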

**Files Modified:**
- `lib/migration-generator.ts` (new)
- `lib/manifest-backfill.ts` (new)
- `packages/squad-cli/src/cli/commands/doctor.ts` (add --repair flag)
- `packages/squad-cli/src/cli/commands/test.ts` (new subcommand: test-migration)
- `.github/workflows/squad-ci.yml` (add manifest version sanity check)
- `test/` (add 20+ tests)

**Risks & Mitigations:**
- Risk: Auto-repair makes wrong decisions → Mitigation: --dry-run flag, manual confirmation for destructive ops
- Risk: Migration generator produces incorrect steps → Mitigation: validate generated migrations against manifest schema
- Risk: Backfill creates incorrect manifest for unknown old layouts → Mitigation: require --confirm flag, log inferred structure

**Definition of Done:**
- Future layout changes can be described in manifest only; migration code auto-generated
- Users can self-service repair broken layouts with `squad doctor --repair`
- Migration testing available for validation before shipping

---

## What's Complete vs What's Remaining

### Done (from related work)
- ✅ **#730 Investigation:** Identified 3 failing code paths causing silent deletion
- ✅ **#731 Tests:** PR with 4 failing tests proving the bugs
- ✅ **#669 Skills Merge:** PR with fix (merge both directories); closed upstream without being merged
- ✅ **Resilience PRD:** `docs/proposals/critical-file-resilience.md` with 5 architecture principles
- ✅ **Fork Issues:** Investigation in diberry/squad #77-82 (root cause + proposed solution)

### Remaining
- ⏳ **Phase 1:** Manifest schema + path resolver + doctor integration (2–3 weeks)
- ⏳ **Phase 2:** Migrate all 5 code paths + integrate PR #669 + fix #730/#731 (3–4 weeks)
- ⏳ **Phase 3:** Migration generator + doctor repair + backfill (2–3 weeks)

### Immediate Actions
1. Land PR #731 (failing tests for #730) to unblock Phase 1 deliverables
2. Merge or re-implement PR #669 (skills merge fix) into Phase 2
3. Create upstream Phase 1/2/3 issues (fork #80-82 don't exist upstream)

---

## Success Criteria

| Criterion | Phase | Measurable |
|-----------|-------|-----------|
| All #731 tests pass | 2 | CI green on PR #731 |
| No critical file can silently disappear during any CLI operation | 2 | `validateCriticalFiles()` runs after every init/upgrade/migrate |
| `squad doctor` accurately reports all broken states as `fail` | 2 | Doctor severity derived from `CriticalFileRegistry` tier |
| CI blocks PRs that introduce new silent-skip patterns | 2 | `critical-file-check` job in squad-ci.yml green |
| Skills discovery works with both `.squad/` and `.copilot/` present | 2 | All 11 test scenarios pass |
| Recovery cascade restores critical files from template or git | 2 | Round-trip tests (init → delete → upgrade → verify) pass |
| Users can migrate old layouts with `squad doctor --repair` | 3 | Repair command works on v0.7/v0.8 repos |
| Future layout changes require only manifest update | 3 | Migration code can be auto-generated from manifest diff |

---

## Related Work & References

| # | Repo | Type | Title | Status | Relationship |
|---|------|------|-------|--------|--------------|
| #670 | upstream | Issue | PRD: Versioned file layout manifest | **OPEN (THIS)** | Main proposal |
| #730 | upstream | Issue | squad.agent.md silently disappears | OPEN | Example of root cause; Phase 2 deliverable |
| #731 | upstream | PR | Failing tests for #730 | OPEN | Needs Phase 1 foundation to pass |
| #732 | upstream | Issue | Critical file resilience framework | CLOSED | Rolled into #670 |
| #669 | upstream | PR | Fix: merge skills from both directories | CLOSED (not merged) | Needed for Phase 2 |
| #77 | fork | Issue | Skills discovery bug | OPEN | Root cause that motivated #670 |
| #78–82 | fork | Issues | Phases 1–3 architecture + implementation | OPEN | Fork investigation; needs upstream issues |

---

## Backward Compatibility & Customer Impact

### Guarantee to Existing Users

- **`.ai-team/` repos (v0.7):** Manifest includes `.ai-team/` in `readOrder`; users can upgrade without moving files manually
- **`.squad/` repos (v0.8):** Manifest includes `.squad/` in `readOrder` and `deprecated`; automatic migration on upgrade moves files to `.copilot/`, old paths still readable during transition
- **`.copilot/` repos (v0.9+):** Canonical layout; no changes
- **No breaking changes:** All upgrades are backward-compatible; deprecated paths remain readable for 1–2 versions

### Upgrade Communication Plan

1. **Release notes:** "Squad now unifies directory structure across versions. Your layout will auto-migrate on `squad upgrade`."
2. **Migration log:** `squad upgrade` output includes: "Migrated skills from `.squad/skills/` to `.copilot/skills/`" for transparency
3. **Doctor report:** `squad doctor` reports any deprecated paths still in use, with clear migration guidance
4. **Rollback story:** If migration fails, recover with `squad doctor --repair` or revert to previous version
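
The doctor report on deprecated paths could be derived directly from the manifest, as in the sketch below. The `exists` callback stands in for whatever storage layer the CLI uses, and `findDeprecatedInUse` and `DeprecatedFinding` are hypothetical names.

```typescript
// Subset of the manifest entry fields needed for the report.
interface LayoutEntry {
  writeTarget: string;
  deprecated: string[];
}

interface DeprecatedFinding {
  name: string;
  path: string;
  migrateTo: string;
}

// Lists every deprecated path that still exists in the repo, with its migration target.
function findDeprecatedInUse(
  layout: Record<string, LayoutEntry>,
  exists: (relativePath: string) => boolean,
): DeprecatedFinding[] {
  const findings: DeprecatedFinding[] = [];
  for (const name of Object.keys(layout)) {
    const entry = layout[name];
    for (const path of entry.deprecated) {
      if (exists(path)) findings.push({ name, path, migrateTo: entry.writeTarget });
    }
  }
  return findings;
}

const sampleLayout: Record<string, LayoutEntry> = {
  skills: { writeTarget: ".copilot/skills/", deprecated: [".squad/skills/", ".ai-team/skills/"] },
  config: { writeTarget: ".squad/", deprecated: [] },
};
// Pretend only the v0.8 skills directory is still on disk.
const onDisk = [".squad/skills/", ".squad/"];

for (const f of findDeprecatedInUse(sampleLayout, (p) => onDisk.indexOf(p) !== -1)) {
  console.log(`${f.name}: ${f.path} is deprecated; migrate to ${f.migrateTo}`);
}
```

Note that `.squad/` itself is not flagged here, because it remains the canonical location for `config` in the example manifest.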

---

## Out of Scope

- **User-owned files** (`overwriteOnUpgrade: false`): `team.md`, `routing.md`, the contents of `decisions/`, agent histories, identity files. These are user content: the framework doesn't overwrite them or validate what they contain.
- **Runtime agent behavior:** This proposal covers CLI file operations only (init, upgrade, migrate, doctor). Agent runtime logic (how agents read/write at runtime) is a separate concern.
- **Non-file invariants:** Config validation, schema enforcement, and other non-filesystem concerns are outside this proposal's scope.

---

## Open Questions & Decisions Needed

1. **Manifest versioning:** Should we store multiple manifest versions in git history, or just the latest? (Recommendation: latest only; migrations derive from code history)
2. **Manifest distribution:** Should users commit the manifest to their repos, or is it CLI-only? (Recommendation: Users don't need to commit; it's shipped with CLI)
3. **Manifest auto-update on CLI upgrade:** Should `squad upgrade` fetch latest manifest from CLI release? (Recommendation: Yes, like TEMPLATE_MANIFEST)
4. **Concurrent access:** How do we handle two Squad sessions (e.g., git worktrees) reading/writing manifest simultaneously? (Recommendation: Use git locks; out of scope for Phase 1)

---

## Implementation Notes for Teams

### For Flight (Lead)
- Validate that manifest schema covers all 5 code paths
- Approve architecture principles (fail-loud, post-op validation, recovery cascade)
- Confirm phases are properly sequenced and non-blocking

### For EECOM (Core Dev)
- Technical feasibility of manifest-driven path resolution in init.ts, resolver.ts, skill-source.ts
- Performance impact of post-operation validation (should be negligible)
- Code review for Phase 1 & 2 implementation

### For FIDO (Quality Owner)
- Test plan adequacy; ensure 11 scenarios cover all version transitions
- Negative path coverage mandate: every critical file write must have "template missing" test
- CI gates (silent-skip grep check, template coverage check, doctor severity audit)
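
As a starting point for the silent-skip check, a naive text scan can flag `existsSync` guards whose block closes without an `else`. This is a heuristic sketch, not a parser: it assumes the `} else {` style used in this document, counts braces without understanding strings or comments, and would report a false positive if `else` sits on the following line. `findSilentSkips` is a hypothetical name.

```typescript
// Returns 1-based line numbers of `if (...existsSync(...)) {` guards with no else branch.
function findSilentSkips(source: string): number[] {
  const lines = source.split("\n");
  const hits: number[] = [];
  const guard = /^\s*if\s*\(.*existsSync\(.*\)\s*\{\s*$/;

  for (let i = 0; i < lines.length; i++) {
    if (!guard.test(lines[i])) continue;
    let depth = 0;
    let closed = false;
    // Walk forward, tracking brace depth until the guarded block closes.
    for (let j = i; j < lines.length && !closed; j++) {
      const line = lines[j];
      for (let k = 0; k < line.length; k++) {
        if (line[k] === "{") {
          depth++;
        } else if (line[k] === "}") {
          depth--;
          if (depth === 0) {
            // The guarded block just closed; without a trailing `else` it is a silent skip.
            if (!/^\s*else\b/.test(line.slice(k + 1))) hits.push(i + 1);
            closed = true;
            break;
          }
        }
      }
    }
  }
  return hits;
}

const silent = [
  "if (storage.existsSync(source)) {",
  "  storage.copySync(source, dest);",
  "}",
].join("\n");

console.log(findSilentSkips(silent)); // flags line 1
```

A production gate would more likely use the TypeScript compiler API or a lint rule, but a scan like this is enough to catch the exact pattern found in `upgrade.ts`.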

### For Procedures (Prompt Engineer)
- Impact on squad.agent.md agent instructions
- Update instructions to reference manifest-driven paths
- Ensure agent doesn't hardcode deprecated paths in new generated content

### For PAO (DevRel)
- Upgrade communication plan (release notes, docs, migration guide)
- Blog post: "How Squad Now Manages File Layout"
- FAQ: "How do I migrate my `.squad/` repo to `.copilot/`?"

