[Feedback] Meditationstimer — Workflow-Analyse (Workflow v6, Agent Orchestrator Pattern)

## Project Context

**Meditationstimer / Healthy Habits Haven** is an iOS + watchOS SwiftUI meditation and wellness app (Swift 6.2, Xcode 26, iOS 18.5+). The project uses Claude Code with a custom **Workflow v6** — an 8-phase, multi-agent orchestrator pattern with hook-enforced checkpoints, TDD RED/GREEN discipline, and a dedicated Adversary agent.

Stack: SwiftUI, HealthKit, XCTest/XCUITest, GitHub Issues for backlog.

---

## What Works (Evidence-Based)

### 1. Hook-enforced Phase Gating — Highly Effective
The `edit_gate.py` hook enforces strict file access per phase:
- Phase 4 (`phase4_tdd_red`): only test files editable
- Phase 5 (`phase5_implement`): only source files editable, test changes blocked
- Phase 6+: scope limit of 5 code files checked
- `bash_gate.py` blocks `git commit` without `checkpoint3_approved` and a GitHub Issue reference

Evidence: Both archived workflows (`bug-background-meditation.json`, `bug-7-remaining-tests.json`) completed all 3 checkpoints correctly. The guard has dedicated logic for `__infra__` overrides, stop-locks, and build-lock serialization.

### 2. 3-Checkpoint System — Fully Used in Practice
Both archived workflows show `checkpoint1_approved`, `checkpoint2_approved`, `checkpoint3_approved` all set. The `phase_listener.py` hook detects natural-language keywords ("stimmt", "go", "commit") to set these — Claude cannot set them directly.

Evidence from `bug-background-meditation.json`:
```json
"checkpoint1_notes": "User approved at 2026-04-22T12:44:43",
"checkpoint2_notes": "User approved at 2026-04-22T14:41:54",
"checkpoint3_notes": "User approved at 2026-04-22T16:13:38"
```

### 3. TDD RED/GREEN Cycle — Consistently Enforced
Evidence from `bug-background-meditation.json`:
```json
"ui_test_red_result": "3/3 runs FAILED - test_backgroundForeground_sessionStillRunning consistently fails",
"green_test_result": "UI Tests GREEN after Finding-1-Fix: both BackgroundMeditationUITests passed"
```
The `post_bash.py` hook auto-warns when tests pass during RED phase or fail during GREEN phase.

### 4. Adversary Findings System — Granular, Traceable
The `add-finding` CLI creates structured records with `id`, `title`, `impact`, `proof`, and per-finding resolution status (`fix`/`accept`/`defer`). Evidence from `bug-background-meditation.json` — 3 findings, each resolved separately:
- Finding #1: `"status": "fix"` (timing-sensitive scenePhase guard)
- Finding #2: `"status": "defer"` (deep-link zombie session — separate ticket)
- Finding #3: `"status": "accept"` (missing unit tests from spec)

### 5. Information Isolation Between Agents — Designed and Followed
The orchestrator commands (`10-bug.md`, `11-feature.md`) explicitly define what each agent receives and what it must not see (e.g., User-Advocate gets no code, Investigators don't see each other's results, Adversary gets no analysis or Developer report). This is enforced by convention, not hooks.

### 6. Localization Gate at Commit
`bash_gate.py` blocks `git commit` unless `localize_checked: true` OR `no_user_strings: true`. This prevents shipping untranslated strings.
Evidence: memory entry `bug38-uppercase-investigation.md` — 163 missing DE/EN translations were the root cause of uppercase debug keys; the gate would have caught this if committed through the workflow.

---

## What Doesn't Work (Evidence-Based)

### 1. `red_test_done` Flag Never Set (Only UI Variant)
Both archived workflows show `"red_test_done": false` alongside `"ui_test_red_done": true`. The standard unit test RED artifact is never registered because the project uses UI tests exclusively for TDD. This means the edit_gate RED-check (step 11 in `edit_gate.py`) always falls through to the `ui_test_red_done` fallback. The primary flag is effectively dead code.

### 2. Adversary "AMBIGUOUS" Verdict Has No Enforcement Path
`bug-background-meditation.json` shows `"adversary_verdict": "AMBIGUOUS"` but the workflow still proceeded to `checkpoint3_approved` and commit. The `bash_gate.py` only blocks on unresolved *findings*, not on AMBIGUOUS verdict itself. An AMBIGUOUS verdict with all findings resolved is indistinguishable from VERIFIED at the gate level.

### 3. Trial-and-Error Despite Analysis-First Principle
Memory entry for Bug 38 (xcstrings uppercase keys):
> "12 Versuche mit .textCase, NavigationView, Scheme-Flags etc. waren alle Sackgassen"

Despite the `analysis-first` standard, the root cause (missing xcstrings translations) was found only after exhaustive trial-and-error. The workflow's Analysis phase did not prevent this. The phase2_analyse structure does not mandate "falsification of candidates before fixing."

### 4. Checkpoint Bypass Was Attempted
Memory entry: `feedback_no_checkpoint_bypass.md` — "WORKFLOW_CALLER=phase_listener nie manuell setzen". This feedback exists because the bypass was actually attempted. The `phase_listener.py` guard (`os.environ.get("CLAUDE_ADMIN")`) is the only protection, and `WORKFLOW_CALLER` itself is not validated by `workflow.py`.

### 5. Scope Enforcement Is File-Count-Only, Not LoC
`edit_gate.py` checks `len(code_affected) > 5` but does not count lines changed. The CLAUDE.md rule of `±250 LoC` has no automated enforcement. Commits like `f1b9a4a fix: Add 163 missing DE/EN translations` (likely >250 LoC) could pass the gate undetected.

---

## Gaps / Blind Spots

### 1. No Workflow Metrics / Observability
No system measures: how often each phase catches issues, average time per phase, how often the Adversary is BROKEN vs VERIFIED, how many findings are deferred vs fixed. Without data, it's impossible to know which phases add the most value.

### 2. Phase Transitions Are Unguarded
Hooks block code edits per phase, but *phase transitions themselves* are not enforced by hooks — the orchestrator calls `workflow.py phase phase5_implement` directly. Claude could skip from `phase2_analyse` to `phase5_implement` without checkpoints. The only enforcement is the keyword-gated checkpoint flags (which block the *commit* but not the *transition*).

### 3. Agent Information Isolation Is Convention-Only
The critical isolation rules (User-Advocate sees no code, Adversary sees no analysis) are in the orchestrator's markdown instructions, not technically enforced. A future Claude session, or a new orchestrator, could violate these easily.

### 4. No Session Resilience Documentation
The `.sessions.json` multi-session mapping exists, but there's no documented recovery procedure when a session ends mid-phase. The `user_override_token.json` (currently untracked in git) is the escape hatch, but its use is not documented in the workflow commands.

### 5. Developer Agent Worktree Isolation Unclear
The `10-bug.md` command says `isolation: "worktree"` for the Developer Agent, but `.claude/worktrees/` exists as an empty directory and the archived workflows show no worktree artifact. Unclear whether worktree isolation is actually being used or silently skipped.

### 6. No Spec Quality Gate
The Spec-Writer agent produces a spec, Henning approves it with a keyword, and the QA-Writer writes tests against it. But there's no check that the spec is *testable* — no acceptance criteria format, no "must include at least N acceptance tests" requirement. Vague specs produce vague tests.

---

## Concrete Recommendations for agent-os-openspec

1. **Distinguish `red_test_done` from `ui_test_red_done` in standards** — projects that use only UI tests for TDD should have a single unified `tdd_red_done` flag, or the standard should clarify the difference and which gate checks which.

2. **Adversary AMBIGUOUS must block commit** — add a gate that treats `adversary_verdict == "AMBIGUOUS"` with zero findings the same as BROKEN, requiring explicit Henning override. "AMBIGUOUS but resolved findings" is a false green.

3. **Analysis-First standard should include candidate falsification** — before fixing, the analyst must list alternative root cause candidates and explicitly eliminate each. This prevents the "12 attempts" anti-pattern.

4. **Add LoC counting to scope guard** — the edit_gate already tracks `affected_files`; extend it with a `git diff --shortstat` check on modified files. Block if total LoC delta exceeds the project limit (250 in this project).

5. **Spec acceptance criteria format standard** — define a mandatory section in specs: `## Acceptance Criteria` with `Given/When/Then` or similar testable format. The QA-Writer should refuse to write tests without it.

6. **Phase transition audit trail** — `workflow.py phase <phase>` should log the calling context (timestamp, session) so it's possible to audit whether phases were skipped.

7. **Information isolation as hook** — consider a lightweight `context_gate.py` hook that reads agent role from session metadata and blocks tool calls that violate isolation rules (e.g., User-Advocate trying to Read .swift files).

---

🤖 Generated with [Claude Code](https://claude.ai/claude-code)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feedback] Meditationstimer — Workflow-Analyse (Workflow v6, Agent Orchestrator Pattern) #4

Project Context

What Works (Evidence-Based)

1. Hook-enforced Phase Gating — Highly Effective

2. 3-Checkpoint System — Fully Used in Practice

3. TDD RED/GREEN Cycle — Consistently Enforced

4. Adversary Findings System — Granular, Traceable

5. Information Isolation Between Agents — Designed and Followed

6. Localization Gate at Commit

What Doesn't Work (Evidence-Based)

1. `red_test_done` Flag Never Set (Only UI Variant)

2. Adversary "AMBIGUOUS" Verdict Has No Enforcement Path

3. Trial-and-Error Despite Analysis-First Principle

4. Checkpoint Bypass Was Attempted

5. Scope Enforcement Is File-Count-Only, Not LoC

Gaps / Blind Spots

1. No Workflow Metrics / Observability

2. Phase Transitions Are Unguarded

3. Agent Information Isolation Is Convention-Only

4. No Session Resilience Documentation

5. Developer Agent Worktree Isolation Unclear

6. No Spec Quality Gate

Concrete Recommendations for agent-os-openspec

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[Feedback] Meditationstimer — Workflow-Analyse (Workflow v6, Agent Orchestrator Pattern) #4

Description

Project Context

What Works (Evidence-Based)

1. Hook-enforced Phase Gating — Highly Effective

2. 3-Checkpoint System — Fully Used in Practice

3. TDD RED/GREEN Cycle — Consistently Enforced

4. Adversary Findings System — Granular, Traceable

5. Information Isolation Between Agents — Designed and Followed

6. Localization Gate at Commit

What Doesn't Work (Evidence-Based)

1. red_test_done Flag Never Set (Only UI Variant)

2. Adversary "AMBIGUOUS" Verdict Has No Enforcement Path

3. Trial-and-Error Despite Analysis-First Principle

4. Checkpoint Bypass Was Attempted

5. Scope Enforcement Is File-Count-Only, Not LoC

Gaps / Blind Spots

1. No Workflow Metrics / Observability

2. Phase Transitions Are Unguarded

3. Agent Information Isolation Is Convention-Only

4. No Session Resilience Documentation

5. Developer Agent Worktree Isolation Unclear

6. No Spec Quality Gate

Concrete Recommendations for agent-os-openspec

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

1. `red_test_done` Flag Never Set (Only UI Variant)