refactor: codec pipeline restructuring and progressive disclosure infrastructure #40

Merged
darko-mijic merged 7 commits into main from refactor/progressive-disclosure-codec-pipeline
Apr 1, 2026

@darko-mijic darko-mijic commented Apr 1, 2026

Problem

The codec pipeline — the layer that transforms extracted pattern data into generated Markdown documentation — had accumulated structural debt that was increasing the cost of every new feature:

  1. 7-point registration ceremony. Adding a single new document codec required changes in 7 locations across 3 files: import groups, document type map, options interface, registry calls, and factory registration. This made codec development unnecessarily friction-heavy.

  2. Type system split-brain. Codecs received MasterDataset (the Zod-inferred extraction read model), but runtime context like workflow definitions and project metadata lived on a separate RuntimeMasterDataset extension. Codecs couldn't access runtime context through the type system.

  3. Monolithic reference codec. reference.ts at 2,019 lines mixed type definitions, static product-area metadata, section builders, diagram infrastructure, and factory logic in a single file — making navigation and focused changes difficult.

  4. Zero IndexCodec test coverage. The documentation entry point (INDEX.md generator) had no tests, making any upstream changes risky.

  5. Duplicated utilities and silent enum divergence. completionPercentage, normalizeImplPath, and architecture enum values were independently maintained in multiple locations.

Solution

CodecContext — Clean Separation of Extraction and Runtime Data

All codecs now receive a CodecContext wrapper instead of raw MasterDataset:

interface CodecContext {
  readonly dataset: MasterDataset;           // extraction products
  readonly projectMetadata?: ProjectMetadata; // config-derived identity
  readonly tagExampleOverrides?: Partial<Record<FormatType, { ... }>>;
}

This keeps MasterDataset as a pure extraction read model (ADR-006) while giving codecs typed access to project identity, regeneration commands, and tag example customizations. The Zod z.codec() boundary is bridged internally: codec authors work with CodecContext, and the Zod layer is an implementation detail.
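As a rough illustration of what "typed access" could look like from a codec author's side, here is a minimal sketch. The types are simplified stand-ins rather than the PR's real definitions, and `BUILT_IN_DEFAULTS` and `resolveProjectMetadata` are hypothetical names introduced only for this example (the default `purpose` string in particular is invented).

```typescript
// Simplified stand-ins for the real types; only the fields used here are modeled.
interface MasterDataset {
  readonly patterns: readonly { readonly name: string }[];
}

interface ProjectMetadata {
  readonly name?: string;
  readonly purpose?: string;
  readonly license?: string;
}

interface CodecContext {
  readonly dataset: MasterDataset; // extraction products
  readonly projectMetadata?: ProjectMetadata; // config-derived identity
}

// Hypothetical built-in defaults. Name and license mirror the previously
// hardcoded values mentioned in this PR; the purpose text is an assumption.
const BUILT_IN_DEFAULTS: Required<ProjectMetadata> = {
  name: '@libar-dev/architect',
  purpose: 'Generated documentation',
  license: 'MIT',
};

// Hypothetical helper: config-derived metadata wins, defaults fill the gaps.
function resolveProjectMetadata(context: CodecContext): Required<ProjectMetadata> {
  const meta = context.projectMetadata;
  return {
    name: meta?.name ?? BUILT_IN_DEFAULTS.name,
    purpose: meta?.purpose ?? BUILT_IN_DEFAULTS.purpose,
    license: meta?.license ?? BUILT_IN_DEFAULTS.license,
  };
}
```

A codec builder would call such a helper once and read identity fields from the result, so the dataset itself never needs to carry runtime configuration.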

Self-Describing Codecs via CodecMeta

Each codec now exports a codecMeta object that carries all registration metadata:

export const codecMeta: CodecMeta = {
  type: 'patterns',
  outputPath: 'PATTERNS.md',
  description: 'Pattern registry with category details',
  factory: createPatternsCodec,
  defaultInstance: PatternsDocumentCodec,
};

A central barrel (codec-registry.ts) collects all meta exports. generate.ts auto-registers from the barrel. Adding a new codec is now 2 touch-points (codec file + barrel import) instead of 7.
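The barrel-plus-auto-registration idea can be sketched roughly as follows. This is not the code from generate.ts: the `DocumentCodec` shape is reduced to a single method, `buildRegistry` is a hypothetical helper, and the duplicate-type check is an assumption about what a registry of this kind would reasonably do.

```typescript
// Simplified stand-ins; the real CodecMeta and codec types are richer.
interface DocumentCodec {
  decode(input: unknown): string;
}

interface CodecMeta {
  readonly type: string;
  readonly outputPath: string;
  readonly description: string;
  readonly factory: (options?: Record<string, unknown>) => DocumentCodec;
  readonly defaultInstance: DocumentCodec;
}

// Hypothetical codec factory used only for illustration.
const createPatternsCodec = (): DocumentCodec => ({
  decode: () => '# Patterns',
});

const patternsMeta: CodecMeta = {
  type: 'patterns',
  outputPath: 'PATTERNS.md',
  description: 'Pattern registry with category details',
  factory: createPatternsCodec,
  defaultInstance: createPatternsCodec(),
};

// The barrel: one array that collects every codec's meta export.
const ALL_CODEC_METAS: readonly CodecMeta[] = [patternsMeta];

// Auto-registration: build the lookup from the barrel and reject duplicates.
function buildRegistry(metas: readonly CodecMeta[]): Map<string, CodecMeta> {
  const registry = new Map<string, CodecMeta>();
  for (const meta of metas) {
    if (registry.has(meta.type)) {
      throw new Error(`Duplicate codec type: ${meta.type}`);
    }
    registry.set(meta.type, meta);
  }
  return registry;
}

const codecRegistry = buildRegistry(ALL_CODEC_METAS);
```

With this shape, the only edit needed for a new codec outside its own file is one more entry in the barrel array.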

ProjectMetadata — Customizable Project Identity

Generated docs (INDEX.md, TAXONOMY.md) previously hardcoded @libar-dev/architect, MIT, and specific pnpm regeneration commands. These values are now configurable via architect.config.ts:

export default defineConfig({
  project: {
    name: '@my-org/my-package',
    purpose: 'Event sourcing toolkit',
    license: 'Apache-2.0',
    regeneration: {
      commands: [
        { label: 'Regenerate all', command: 'pnpm docs:all' },
      ],
    },
  },
  tagExampleOverrides: {
    flag: { example: '@my-prefix-my-flag' },
  },
});

Codecs fall back to built-in defaults when metadata is absent — zero breaking changes for existing consumers.

IndexCodec Extensibility

Three new options make IndexCodec fully customizable without post-processing scripts:

  • purposeText — override the auto-generated document purpose
  • epilogue — replace the regeneration footer with custom SectionBlock[]
  • packageMetadataOverrides — override individual metadata table fields
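The commit notes for this PR describe the footer precedence as epilogue, then projectMetadata.regeneration, then the built-in footer. A minimal sketch of that cascade is shown here, assuming footers are plain strings (the real epilogue option is SectionBlock[]); `FooterInputs`, `BUILT_IN_FOOTER`, and `resolveFooter` are hypothetical names, and treating an empty epilogue as "no footer" is an assumption.

```typescript
interface RegenerationCommand {
  readonly label: string;
  readonly command: string;
}

// Simplified inputs: strings stand in for the real SectionBlock values.
interface FooterInputs {
  readonly epilogue?: readonly string[];
  readonly regenerationCommands?: readonly RegenerationCommand[];
}

// Hypothetical built-in footer text.
const BUILT_IN_FOOTER: readonly string[] = ['Regenerate with: pnpm docs:all'];

// Highest-priority source wins: epilogue, then configured commands, then built-in.
function resolveFooter(inputs: FooterInputs): readonly string[] {
  if (inputs.epilogue !== undefined) {
    return inputs.epilogue; // an explicit epilogue replaces the footer entirely
  }
  const commands = inputs.regenerationCommands;
  if (commands !== undefined && commands.length > 0) {
    return commands.map((c) => `${c.label}: ${c.command}`);
  }
  return BUILT_IN_FOOTER;
}
```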

Codec Consolidation via View Discriminant

Timeline codecs (Roadmap, Milestones, CurrentWork) and Session codecs (SessionContext, RemainingWork) are unified into single factories with a view discriminant:

createTimelineCodec({ view: 'completed' })  // was: createMilestonesCodec()
createSessionCodec({ view: 'remaining' })   // was: createRemainingWorkCodec()

Backward-compatible aliases preserve all existing exports and CLI generator names.
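A view discriminant of this kind might look like the sketch below. The item and codec shapes are simplified assumptions, not the PR's types; only the view names (all/completed/active) and the Milestones alias come from the PR text.

```typescript
type TimelineView = 'all' | 'completed' | 'active';

// Simplified item shape, assumed for illustration.
interface TimelineItem {
  readonly name: string;
  readonly status: 'completed' | 'active' | 'roadmap';
}

interface TimelineCodec {
  readonly view: TimelineView;
  decode(items: readonly TimelineItem[]): string[];
}

// One factory; the view option selects which items the document includes.
function createTimelineCodec(options: { view: TimelineView }): TimelineCodec {
  const { view } = options;
  return {
    view,
    decode(items: readonly TimelineItem[]): string[] {
      return items
        .filter((item) => view === 'all' || item.status === view)
        .map((item) => item.name);
    },
  };
}

// Backward-compatible alias: the old factory name delegates to the unified one.
const createMilestonesCodec = (): TimelineCodec =>
  createTimelineCodec({ view: 'completed' });
```

Keeping the old names as thin aliases means existing imports and CLI generator names continue to resolve while all behavior lives in one factory.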

Progressive Disclosure Infrastructure

New render-layer infrastructure for auto-splitting oversized detail files:

  • SizeBudget and RenderOptions types separate presentation concerns from codec content decisions
  • splitOversizedDocument() splits at H2 boundaries with kebab-case sub-file paths, back-links, and parent link-out references
  • renderDocumentWithFiles() integrates splitting transparently — no RenderOptions means no splitting (fully backward compatible)
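To make the H2-boundary splitting concrete, here is a simplified sketch that works on a rendered Markdown string rather than the PR's block model. Everything in it is an assumption for illustration: `maxLines` as the budget field, the `splitAtH2` name, flat sub-file paths, and the policy of extracting every H2 section once the document is over budget. It also does not guard against `## ` lines inside fenced code blocks.

```typescript
interface SizeBudget {
  readonly maxLines: number; // hypothetical field name
}

interface SplitResult {
  readonly parent: string;
  readonly subFiles: Record<string, string>;
}

function toKebabCase(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

function splitAtH2(markdown: string, parentPath: string, budget: SizeBudget): SplitResult {
  const lines = markdown.split('\n');
  if (lines.length <= budget.maxLines) {
    return { parent: markdown, subFiles: {} }; // within budget: no splitting
  }

  const parentLines: string[] = [];
  const subFiles: Record<string, string> = {};
  let current: { heading: string; body: string[] } | undefined;

  // Move the section collected so far into its own file and link to it.
  const flush = (): void => {
    if (current === undefined) return;
    const subPath = `${toKebabCase(current.heading)}.md`;
    subFiles[subPath] = [
      `[Back to ${parentPath}](${parentPath})`, // back-link to the parent
      '',
      `# ${current.heading}`,
      ...current.body,
    ].join('\n');
    parentLines.push(`## ${current.heading}`, '', `See [${current.heading}](${subPath}).`, '');
    current = undefined;
  };

  for (const line of lines) {
    if (line.startsWith('## ')) {
      flush();
      current = { heading: line.slice(3).trim(), body: [] };
    } else if (current !== undefined) {
      current.body.push(line);
    } else {
      parentLines.push(line); // content before the first H2 stays in the parent
    }
  }
  flush();

  return { parent: parentLines.join('\n'), subFiles };
}
```

A caller that passes no budget at all would simply skip this function, which matches the backward-compatible behavior described above.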

Reference Codec Decomposition

The 2,019-line reference.ts is split into 5 focused modules with an acyclic import chain:

| Module | Lines | Responsibility |
| --- | --- | --- |
| reference-types.ts | 170 | Interfaces, constants, config types |
| product-area-metadata.ts | 328 | Static product area mappings |
| reference-builders.ts | 423 | Section builder functions |
| reference-diagrams.ts | 731 | Mermaid diagram infrastructure |
| reference.ts | 501 | Factory, decode paths, re-exports |

All previously exported symbols are re-exported from reference.ts for backward compatibility.

Additional Improvements

  • createDecodeOnlyCodec() helper eliminates ~200 lines of identical z.codec() + encode: () => throw boilerplate across 24 factories
  • GeneratorContext.masterDataset is now required — 6 unnecessary null-guards removed
  • archRole/archLayer shared constants eliminate silent enum divergence between the tag registry and Zod schema
  • behaviorCategories and conventionTags are now optional on ReferenceDocConfig (default: [])
  • Default output directory changed from docs/architecture to docs-live
  • bySource renamed to bySourceType in MasterDatasetSchema for naming accuracy
  • Recursive deep merge for codec options in the orchestrator — nested per-codec options merge correctly instead of being clobbered
  • Optional tagRegistry on PipelineOptions — eliminates redundant config load when the orchestrator already has a resolved config
  • backLink() and includesDetail() helpers added for progressive disclosure support
  • Vestigial grouping functions (groupByCategory, groupByPhase, groupByQuarter) removed — zero consumers, superseded by MasterDataset pre-computed views
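The decode-only helper mentioned in the first bullet can be pictured with the following sketch. It deliberately leaves Zod out: the real helper wraps z.codec(), whereas this version uses a hand-written `DocumentCodec` interface (an assumption) to show only the pattern of supplying decode and having encode throw.

```typescript
// Simplified codec shape; the real one is built on Zod's codec support.
interface DocumentCodec<In, Out> {
  decode(input: In): Out;
  encode(output: Out): In;
}

// Wraps a decode function and supplies an encode that always throws,
// so each factory no longer repeats that boilerplate.
function createDecodeOnlyCodec<In, Out>(
  decode: (input: In) => Out,
): DocumentCodec<In, Out> {
  return {
    decode,
    encode: (): never => {
      throw new Error('This codec is decode-only; encode is not supported');
    },
  };
}

// Usage with a toy decode function.
const upperCaseCodec = createDecodeOnlyCodec((text: string) => text.toUpperCase());
```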

Breaking Changes

| Change | Impact | Migration |
| --- | --- | --- |
| Default output directory → docs-live | Config consumers that relied on the docs/architecture default | Set output.directory explicitly |
| bySource → bySourceType on MasterDataset | Direct MasterDataset consumers | Rename field access |
| behaviorCategories/conventionTags now optional | None; .default([]) handles it | No action needed |
| GeneratorContext.masterDataset required | TS consumers that constructed partial contexts | Provide masterDataset (was always required at runtime) |

Test Plan

  • pnpm typecheck — clean
  • pnpm lint — clean
  • pnpm test — 8,807 tests passing across 142 test files
  • pnpm docs:all — documentation generates without errors
  • All pre-commit hooks pass (lint-staged, typecheck, Process Guard)
  • Verify in consumer monorepo — ProjectMetadata defaults produce identical INDEX.md and TAXONOMY.md output

Summary by CodeRabbit

  • New Features

    • Project metadata (name/purpose/version/license) and regeneration commands; per-format tag example overrides
    • Index document: custom purpose text, epilogue, and package-metadata overrides
    • Reference docs: richer product-area content and automatic diagram generation
    • Automatic document splitting for oversized detail files
  • Configuration Changes

    • Default output directory changed from "docs/architecture" → "docs-live"
  • Removals

    • brief and isCore metadata tag support removed; brief-path behavior deleted

…shape directives

Remove @architect-brief (dead relic) and @architect-core flag (zero consumers),
fix hardcoded 'ddd' category fallback in Gherkin extractor, validate category
inference against registry, and filter shape-only directives at scanner level.

BREAKING CHANGE: @architect-brief tag removed, isCore field removed from schemas
…rastructure

8-phase refactoring of the codec pipeline based on architectural review:

Phase 0 — Safety Net:
- Add 101 IndexCodec regression tests (zero prior coverage)
- Extract archRole/archLayer to shared constants (eliminate enum divergence)
- Fix stale @architect-core flag example in TaxonomyCodec

Phase 1 — Structural Foundations:
- Add createDecodeOnlyCodec() helper eliminating ~200 lines of boilerplate
- Introduce CodecContext wrapper separating extraction from runtime context
- Decompose reference.ts (2,019 lines) into 5 focused modules
- Deduplicate normalizeImplPath, completionPercentage; remove vestigial grouping functions
- Add backLink() and includesDetail() helpers
- Make GeneratorContext.masterDataset required (remove 6 null-guards)

Phase 2 — Config Simplification (Spec 1):
- Add ProjectMetadata and RegenerationConfig types with Zod schemas
- Add tagExampleOverrides with FormatType-constrained keys
- Change default output directory from docs/architecture to docs-live
- Make behaviorCategories and conventionTags optional with .default([])
- Thread projectMetadata and tagExampleOverrides through CodecContext
- IndexCodec reads context.projectMetadata for name/purpose/license
- TaxonomyCodec overlays context.tagExampleOverrides on format examples

Phase 3 — IndexCodec Extensibility (Spec 2):
- Add purposeText, epilogue, packageMetadataOverrides options
- 2-level footer cascade: epilogue > projectMetadata.regeneration > built-in

Phase 4 — Self-Describing Codecs:
- Add CodecMeta interface for self-describing codec registration
- Add codecMeta/codecMetas exports to all 15+ codec files
- Create codec-registry.ts barrel collecting all meta exports
- Auto-register codecs from meta (~119 lines removed from generate.ts)

Phase 5 — Codec Consolidation (Spec 4):
- Unify Timeline 3→1 with view discriminant (all/completed/active)
- Unify Session 2→1 with view discriminant (context/remaining)
- Backward-compatible aliases preserve all existing exports

Phase 6 — Progressive Disclosure (Spec 3):
- Add SizeBudget, RenderOptions types in render-options.ts
- Add splitOversizedDocument() with H2-boundary splitting
- Integrate auto-splitting into renderDocumentWithFiles()

Phase 7 — Cleanup:
- Add optional tagRegistry on PipelineOptions (eliminate double config load)
- Remove cross-layer re-export from extractor/index.ts
- Rename bySource to bySourceType in MasterDataset
- Add deep merge for codec options in orchestrator

Net: 44 files modified, 10 created, -1,115 lines
…lag example

- Make deepMergeCodecOptions fully recursive via deepMergePlainObjects
  so nested per-codec options (e.g., index.packageMetadataOverrides)
  merge correctly instead of being clobbered at the second level.
- Restore @architect-sequence-error flag example in TaxonomyCodec
  (both buildFormatTypesSection and detail document) — Phase 2 agent
  overwrote Phase 0B's fix when it modified the same file in parallel.
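For readers unfamiliar with the clobbering problem, a recursive merge over plain objects might look like the sketch below. The function name deepMergePlainObjects comes from the commit message, but this body is an independent illustration, not the PR's implementation; replacing arrays wholesale and skipping the `__proto__` key are assumptions.

```typescript
type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  if (typeof value !== 'object' || value === null) return false;
  const proto: unknown = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Recursively merges override into base without mutating either input.
// Plain objects merge key by key; everything else (including arrays) is replaced.
function deepMergePlainObjects(base: PlainObject, override: PlainObject): PlainObject {
  const result: PlainObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    if (key === '__proto__') continue; // avoid prototype pollution
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMergePlainObjects(existing, value)
        : value;
  }
  return result;
}
```

With a shallow or one-level merge, overriding a single field under index.packageMetadataOverrides would discard its sibling fields; the recursive version keeps them.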

coderabbitai bot commented Apr 1, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f3448e3c-b369-4195-b18e-e709057ac042

📥 Commits

Reviewing files that changed from the base of the PR and between 8c3772d and b63215f.

📒 Files selected for processing (9)
  • src/generators/built-in/reference-generators.ts
  • src/renderable/codecs/claude-module.ts
  • src/renderable/codecs/product-area-metadata.ts
  • src/renderable/codecs/reference.ts
  • src/renderable/codecs/session.ts
  • src/renderable/codecs/timeline.ts
  • tests/features/doc-generation/index-codec.feature
  • tests/steps/behavior/cli/process-api-reference.steps.ts
  • tests/steps/doc-generation/index-codec.steps.ts
📝 Walkthrough

Walkthrough

Refactors many codecs to a shared decode-only DocumentCodec model with centralized codec metadata and runtime context enrichment; renames MasterDataset view bySource → bySourceType; removes isCore/brief metadata/tags; adds project metadata and tag-example overrides; changes default output directory to docs-live; introduces reference diagram/builder subsystems.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config & Resolve: src/config/project-config.ts, src/config/project-config-schema.ts, src/config/resolve-config.ts | Add optional project metadata and tagExampleOverrides; make some reference-doc fields default to []; change default output directory to docs-live. |
| Generator orchestration & context: src/generators/orchestrator.ts, src/generators/types.ts, src/generators/codec-based.ts, src/generators/pipeline/build-pipeline.ts | Thread tagRegistry, projectMetadata, tagExampleOverrides through generation APIs and GeneratorContext; require masterDataset in context; accept optional preloaded tagRegistry; deep-merge codec options. |
| Dataset view rename: src/generators/pipeline/transform-dataset.ts, src/validation-schemas/master-dataset.ts, tests/features/...transform-dataset*, tests/steps/...transform-dataset.steps.ts | Rename accumulator and exposed view from bySource → bySourceType; update schema and tests accordingly. |
| Codec base, registry & context enrichment: src/renderable/codecs/types/base.ts, src/renderable/codecs/codec-registry.ts, src/renderable/codecs/index.ts | Introduce CodecMeta, CodecContext, CodecContextEnrichment, runtime enrichment APIs, createDecodeOnlyCodec, and aggregate ALL_CODEC_METAS. |
| Codec factories → decode-only + metadata: src/renderable/codecs/* (many files, e.g., adr.ts, architecture.ts, patterns.ts, planning.ts, timeline.ts, session.ts, index-codec.ts, taxonomy.ts, requirements.ts, reporting.ts, validation-rules.ts, composite.ts, design-review.ts, pr-changes.ts, business-rules.ts, claude-module.ts) | Replace per-file Zod codec wiring with createDecodeOnlyCodec(...) returning DocumentCodec; add codecMeta/codecMetas exports; adapt builders to accept CodecContext (to use project metadata / tagExampleOverrides). |
| Reference subsystem: src/renderable/codecs/reference.ts, src/renderable/codecs/reference-types.ts, src/renderable/codecs/reference-builders.ts, src/renderable/codecs/reference-diagrams.ts, src/renderable/codecs/product-area-metadata.ts | Introduce reference-types, section builders, diagram builders (Mermaid), and product-area metadata; refactor reference.ts to orchestrate via these modules and re-export relevant types/values. |
| Generate & render pipeline: src/renderable/generate.ts, src/renderable/render.ts, src/renderable/split.ts, src/renderable/render-options.ts, src/renderable/index.ts | Auto-register codecs via ALL_CODEC_METAS; add contextEnrichment arg to generate APIs and set/clear enrichment; add RenderOptions/SizeBudget; implement H2-based document splitting and measure helper; update render entry to accept renderer or options. |
| Extraction & scanner changes: src/extractor/*, src/scanner/ast-parser.ts, src/scanner/gherkin-ast-parser.ts, src/cli/validate-patterns.ts | Stop extracting isCore/brief; tighten category fallback logic (use registry first category or uncategorized); skip shape-only directives lacking patternName/implements; update validate-patterns to read bySourceType. |
| Taxonomy & validation schemas: src/taxonomy/registry-builder.ts, src/taxonomy/arch-values.ts, src/validation-schemas/*.ts | Remove core/brief metadata tag declarations; add ARCH role/layer constants; remove isCore/brief fields from directive, process metadata, extracted-pattern, and related schemas/types. |
| Generators behavior changes: src/generators/built-in/reference-generators.ts, src/generators/built-in/design-review-generator.ts | Remove early-return guards for missing masterDataset so generators proceed to codec decoding (context assumes dataset present). |
| Validation & anti-patterns: src/validation/anti-patterns.ts, src/validation/types.ts | Add detector for removed tags (e.g., brief) producing removed-tag violations; extend AntiPatternId union. |
| Tests & fixtures: tests/features/*, tests/steps/*, tests/fixtures/*, tests/support/helpers/assertions.ts | Update tests/fixtures to reflect bySourceType and removed isCore/brief behavior; change default output dir expectation; add comprehensive IndexCodec feature/steps; adjust assertions and helper APIs accordingly. |
| Docs: docs/ARCHITECTURE.md | Document schema view rename bySource → bySourceType. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI / Orchestrator
  participant Pipeline as BuildPipeline
  participant Generator as Generator (context)
  participant CodecReg as ALL_CODEC_METAS
  participant Codec as DocumentCodec
  participant Renderer as Renderer

  CLI->>Pipeline: generateDocumentation(opts with tagRegistry/projectMetadata/overrides)
  Pipeline->>Generator: buildMasterDataset(...)
  Generator->>CodecReg: lookup codec meta for DocumentType
  Generator->>Codec: setCodecContextEnrichment(projectMetadata, tagExampleOverrides)
  CodecReg->>Codec: instantiate factory(options)
  Generator->>Codec: decode({ dataset, projectMetadata, tagExampleOverrides })
  Codec->>Renderer: return RenderableDocument
  Renderer->>Generator: render files (split if sizeBudget)

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

"I nibbled through the codecs' maze,
BySource grew new BySourceType ways,
Metadata and registry aligned,
Project notes and tags entwined,
A rabbit hops—documents rise!" 🐇✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately summarizes the main architectural change: restructuring of the codec pipeline and addition of progressive disclosure infrastructure for document rendering. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.83% which is sufficient. The required threshold is 80.00%. |


… runtime guard

- Make TagExampleOverridesSchema values optional so partial configs
  (e.g., only overriding 'enum' example) pass Zod validation.
  z.record(z.enum(...), schema) requires ALL keys in Zod 4 — wrapping
  the value schema in .optional() allows missing keys.
- Restore defensive runtime guard in CodecBasedGenerator for plain JS
  consumers that may omit masterDataset despite the required TS type.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/renderable/codecs/taxonomy.ts (1)

426-455: ⚠️ Potential issue | 🟠 Major

Stop falling back to literal @architect-* examples here.

The new override layer still defaults to hardcoded tags and values like @architect-pattern, @architect-status roadmap, and @architect-sequence-error. Any project using a non-default tag prefix or registry-defined values will therefore generate incorrect taxonomy docs unless every format is overridden manually. Please derive the fallback examples from the active tag registry/config and apply tagExampleOverrides only as the final override layer.

As per coding guidelines, src/**/*.{ts,tsx}: Use tag registry in src/taxonomy/ to define category names, status values, and tag formats; never hardcode tag values in source files.

Also applies to: 681-749

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderable/codecs/taxonomy.ts` around lines 426 - 455, The defaults
object in buildFormatTypesSection (and the similar block at lines 681-749)
hardcodes `@architect-`* examples; replace those hardcoded fallback examples with
values derived from the active tag registry/config (use the existing
taxonomy/tag-registry API in src/taxonomy/) and only apply
context.tagExampleOverrides as the final layer in getFormatInfo so that
description/example defaults come from the registry (e.g., category names,
allowed enum values, and tag prefixes) rather than literal strings; update
buildFormatTypesSection, the defaults variable, and getFormatInfo to consult the
tag registry helper functions/classes (instead of literal strings) and keep
overrides?.[format] as the last override step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/project-config-schema.ts`:
- Around line 155-157: TagExampleOverridesSchema currently uses
z.record(z.enum(FORMAT_TYPES), TagExampleOverrideSchema) which in Zod 4 requires
all FORMAT_TYPES keys when the field is present; change it to use
z.partialRecord(z.enum(FORMAT_TYPES), TagExampleOverrideSchema) so consumers can
supply partial overrides (e.g., only one format key) while keeping validation
for provided keys; update the TagExampleOverridesSchema definition to use
z.partialRecord and ensure imports/usage of FORMAT_TYPES and
TagExampleOverrideSchema remain the same.

In `@src/renderable/codecs/claude-module.ts`:
- Around line 104-107: The factory createClaudeModuleCodec currently merges
DEFAULT_CLAUDE_MODULE_OPTIONS which still sets fullDocsPath to the old "docs/"
causing emitted _claude-md modules to point to broken links; update
DEFAULT_CLAUDE_MODULE_OPTIONS to set fullDocsPath to "docs-live/" (or derive it
from the configured output directory if available via DEFAULT_BASE_OPTIONS or
the options parameter) so createClaudeModuleCodec produces correct "Full
Documentation" links—modify the DEFAULT_CLAUDE_MODULE_OPTIONS constant (and, if
preferable, make mergeOptions prefer a derived path from
DEFAULT_BASE_OPTIONS/output config) and ensure fullDocsPath is used by
buildClaudeModuleDocument.

In `@src/renderable/codecs/pr-changes.ts`:
- Line 6: Add the required docs scanner annotations by inserting an opt-in tag
and runtime-dependency tags at the top of this file: add a top-of-file comment
containing `@libar-docs` to opt this file into scanning and add one
`@libar-docs-uses` entry per runtime import used by the codec helpers
(specifically for symbols like createDecodeOnlyCodec and any helpers referenced
in the updated import block around lines 45-72); populate each `@libar-docs-uses`
line with the exact import specifier strings from the file’s import statements
so the docs scanner knows the runtime dependencies.

In `@src/renderable/codecs/product-area-metadata.ts`:
- Around line 25-38: PRODUCT_AREA_ARCH_CONTEXT_MAP and PRODUCT_AREA_META are
typed as open-ended Record<string,...>, which allows typos or unsupported
`@architect-product-area` values; replace the loose string keys with the taxonomy
source-of-truth so the compiler enforces valid product-area names. Import the
product-area tag registry (or export a shared const union/enum) from
src/taxonomy and use that type as the key (e.g., Record<ProductAreaTag, readonly
string[]> / Record<ProductAreaTag, ProductAreaMeta>) or build both maps from a
single shared const list of product-area keys so PRODUCT_AREA_ARCH_CONTEXT_MAP
and PRODUCT_AREA_META only accept known product-area identifiers and will error
at compile time for missing/extra keys.

In `@src/renderable/codecs/session.ts`:
- Around line 207-210: The file has import statements for
renderAcceptanceCriteria and renderBusinessRulesSection placed after constant
declarations; move those imports into the top import block with the other
imports (alongside existing imports like toKebabCase) so all imports are grouped
before any code/constant definitions, then remove the duplicate late import
lines referencing renderAcceptanceCriteria and renderBusinessRulesSection.

In `@src/renderable/codecs/timeline.ts`:
- Line 207: The import for renderAcceptanceCriteria and
renderBusinessRulesSection is placed after constant declarations; move the line
"import { renderAcceptanceCriteria, renderBusinessRulesSection } from
'./helpers.js';" up with the other top-of-file imports so all imports are
grouped together (same fix you applied in session.ts), ensuring functions
referenced in this module (renderAcceptanceCriteria, renderBusinessRulesSection)
are imported before use.

In `@src/renderable/codecs/types/base.ts`:
- Around line 206-224: The module-level mutable variable _contextEnrichment and
its setters setCodecContextEnrichment / clearCodecContextEnrichment are
documented but lack an explicit thread-safety warning; update the comment near
_contextEnrichment (and the JSDoc for
setCodecContextEnrichment/clearCodecContextEnrichment) to explicitly state that
this global state is only safe under synchronous single-threaded use (e.g.,
within generate.ts when calling codec.decode()), warn against usage from worker
threads or concurrent async codecs, and recommend passing a
CodecContext/CodecContextEnrichment object through function parameters instead
of relying on module-level state for any future async/worker usage.

In `@src/renderable/codecs/validation-rules.ts`:
- Around line 232-235: The decode callback passed to createDecodeOnlyCodec
currently ignores the CodecContext (_context) and always uses raw
RULE_DEFINITIONS via buildValidationRulesDocument, which prevents
convention-derived rationale from reaching includeErrorGuide; change the
callback in createDecodeOnlyCodec to accept the provided CodecContext, call
composeRationaleIntoRules(context, RULE_DEFINITIONS) (or similar) to produce
enriched rules, then pass those enriched rules into buildValidationRulesDocument
so RuleDefinition.rationale and alternatives from composeRationaleIntoRules flow
through to includeErrorGuide.

In `@src/renderable/render-options.ts`:
- Line 21: The exported DEFAULT_SIZE_BUDGET object is mutable at runtime and
should be frozen to prevent cross-module mutation; update the export so
DEFAULT_SIZE_BUDGET is created and frozen (e.g., via Object.freeze(...)) while
preserving the SizeBudget type (use a type assertion or const typing if needed),
and if the budget can contain nested objects consider applying a shallow or deep
freeze accordingly to ensure runtime immutability.

In `@src/renderable/split.ts`:
- Around line 72-105: The budget check currently measures subLineCount from
renderFn(document(group.heading, group.sections)) but then writes
subFiles[subPath] from document(group.heading, subSections) (which may include a
backlink) and never remeasures the final parent, so produced subFiles or parent
can exceed budget; fix by building the exact sub-file document (use subSections
with optional createBackLink), render that final subRendered and recompute
subLineCount before deciding to extract, and after modifying parentSections
render the final parent via renderFn(document(...)) to verify parent is within
budget (if not, revert or inline the group); update logic around symbols
subSections, subFileName/subPath, subFiles, parentSections, parent and use
renderFn for final measurements.

In `@src/taxonomy/arch-values.ts`:
- Around line 1-9: The file-level JSDoc that currently contains the `@architect`
tag is missing the required `@libar-docs` opt-in marker; update the top comment
block in src/taxonomy/arch-values.ts (the header that references
registry-builder.ts and extracted-pattern.ts) to include `@libar-docs` alongside
`@architect` so the scanner picks up the file for docs processing.

In `@tests/features/doc-generation/index-codec.feature`:
- Around line 211-216: The scenario "Preamble appears after metadata and before
inventory" currently doesn't assert the preamble position because the table only
compares headed sections; update the scenario to make the preamble observable by
either (a) adding the preamble row to the "Then section ordering should be
correct:" table (e.g., include a column/row for the preamble paragraph) or (b)
add a new step after the existing steps—using the existing step phrase "When
decoding with a preamble section and document entries in topic \"Guides\""—that
directly asserts the preamble paragraph's block index (e.g., "Then the preamble
paragraph should be at index X" or similar), and ensure the step definition that
checks paragraph block index is used or implemented so the preamble position is
validated.
- Around line 1-6: Add the repo opt-in/source-planning metadata tags to this
feature by inserting the following annotations alongside the existing
`@architect-`* tags: `@libar-docs`, `@libar-docs-team` (set team name),
`@libar-docs-quarter` (set target quarter), `@libar-docs-depends-on` (list any
dependent feature(s)), and because the pattern is completed also add
`@libar-docs-unlock-reason` (provide the retroactive completion reason); ensure
these exact tag names are present at the top of the file to satisfy scanners and
the completed-pattern requirement.

In `@tests/steps/behavior/cli/process-api-reference.steps.ts`:
- Around line 88-96: Extract a test helper (e.g., createMockGeneratorContext)
that returns a fully populated GeneratorContext object (including baseDir,
outputDir, registry and the now-required masterDataset) instead of repeatedly
casting incomplete objects with "as GeneratorContext"; update the tests that
call generator.generate (the block creating `context` and calling
`generator.generate([], context)`) to use this helper so the context is
type-safe and future changes to GeneratorContext will fail at the helper, not
across four duplicated spots.

In `@tests/steps/doc-generation/index-codec.steps.ts`:
- Around line 504-513: The test step "Then('the Product Area Statistics table
should contain a progress bar'" currently only checks for '%' in the serialized
table and can miss the actual progress bar glyph; update the assertion to
inspect the actual table cell content from findTable(...) / tableBlock.rows (or
the specific progress cell) and assert that it contains the progress bar token
(e.g., '█') or matches the full expected rendered progress-cell format (percent
+ bar), rather than only checking for '%'. Ensure you use getSectionContent,
findTable, and tableBlock.rows to locate the specific cell and replace the
expect(tableContent).toContain('%') with an assertion for the glyph or full cell
pattern.
- Around line 57-68: The helper findSectionByHeading (and the other two heading
helpers in this file) currently use block.text.includes(headingText) which
allows partial matches; change these to perform an exact match (e.g., compare
trimmed text equality: block.text.trim() === headingText) and keep the existing
level check so only the exact heading text at the specified level is returned;
update findSectionByHeading and the other two helper functions to use this exact
equality comparison.
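
The two test-precision changes above could look roughly like the sketch below. The `Block` union, the helper names, and the sample cell format are assumptions made for illustration; the real renderable block types and the rendered progress cell may differ.

```typescript
// Hypothetical block shapes; not the project's real renderable types.
type Block =
  | { type: "heading"; level: number; text: string }
  | { type: "table"; columns: string[]; rows: string[][] };

// Exact, trimmed comparison: "Product Area" no longer matches
// "Product Area Statistics" the way includes() would.
function findHeadingIndex(blocks: readonly Block[], text: string, level: number): number {
  return blocks.findIndex(
    (b) => b.type === "heading" && b.level === level && b.text.trim() === text
  );
}

// True only when some cell holds both a percentage and the bar glyph.
function hasProgressBarCell(rows: readonly string[][]): boolean {
  return rows.some((row) => row.some((cell) => /\d+%/.test(cell) && cell.includes("█")));
}

// Sample data with an assumed progress-cell format.
const blocks: Block[] = [
  { type: "heading", level: 2, text: "Product Area Statistics" },
  { type: "table", columns: ["Area", "Progress"], rows: [["Generation", "███░░ 60%"]] },
];
```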

---

Outside diff comments:
In `@src/renderable/codecs/taxonomy.ts`:
- Around line 426-455: The defaults object in buildFormatTypesSection (and the
similar block at lines 681-749) hardcodes `@architect-*` examples; replace those
hardcoded fallback examples with values derived from the active tag
registry/config (use the existing taxonomy/tag-registry API in src/taxonomy/)
and only apply context.tagExampleOverrides as the final layer in getFormatInfo
so that description/example defaults come from the registry (e.g., category
names, allowed enum values, and tag prefixes) rather than literal strings;
update buildFormatTypesSection, the defaults variable, and getFormatInfo to
consult the tag registry helper functions/classes (instead of literal strings)
and keep overrides?.[format] as the last override step.
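
One way to picture the layering described above, as a sketch only: `FormatInfo`, `registryDefault`, and the `tagPrefix` parameter are hypothetical names, and the real taxonomy codec would read its defaults from the tag registry API rather than from this stub.

```typescript
// Hypothetical types; stand-ins for the codec's real format metadata.
interface FormatInfo {
  description: string;
  example: string;
}
type FormatOverrides = Partial<Record<string, Partial<FormatInfo>>>;

// Derive the fallback from the active tag prefix instead of a literal string.
function registryDefault(format: string, tagPrefix: string): FormatInfo {
  return {
    description: `Format type: ${format}`,
    example: `${tagPrefix}example-tag ${format}`,
  };
}

// Overrides are the last layer, applied field by field over registry-derived defaults.
function getFormatInfo(
  format: string,
  tagPrefix: string,
  overrides?: FormatOverrides
): FormatInfo {
  const base = registryDefault(format, tagPrefix);
  const override = overrides?.[format];
  return {
    description: override?.description ?? base.description,
    example: override?.example ?? base.example,
  };
}
```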

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 62fdeeca-cb18-41eb-a193-f0268545382d

📥 Commits

Reviewing files that changed from the base of the PR and between e979c1a and f8a2dad.

⛔ Files ignored due to path filters (3)
  • docs-inbox/architectural-review-progressive-disclosure-and-codecs.md is excluded by none and included by none
  • docs-inbox/codebase-exploration-findings.md is excluded by none and included by none
  • docs-inbox/refactoring-execution-guide.md is excluded by none and included by none
📒 Files selected for processing (68)
  • src/cli/validate-patterns.ts
  • src/config/project-config-schema.ts
  • src/config/project-config.ts
  • src/config/resolve-config.ts
  • src/extractor/doc-extractor.ts
  • src/extractor/dual-source-extractor.ts
  • src/extractor/gherkin-extractor.ts
  • src/extractor/index.ts
  • src/generators/built-in/design-review-generator.ts
  • src/generators/built-in/reference-generators.ts
  • src/generators/codec-based.ts
  • src/generators/orchestrator.ts
  • src/generators/pipeline/build-pipeline.ts
  • src/generators/pipeline/transform-dataset.ts
  • src/generators/types.ts
  • src/renderable/codecs/adr.ts
  • src/renderable/codecs/architecture.ts
  • src/renderable/codecs/business-rules.ts
  • src/renderable/codecs/claude-module.ts
  • src/renderable/codecs/codec-registry.ts
  • src/renderable/codecs/composite.ts
  • src/renderable/codecs/design-review.ts
  • src/renderable/codecs/index-codec.ts
  • src/renderable/codecs/index.ts
  • src/renderable/codecs/patterns.ts
  • src/renderable/codecs/planning.ts
  • src/renderable/codecs/pr-changes.ts
  • src/renderable/codecs/product-area-metadata.ts
  • src/renderable/codecs/reference-builders.ts
  • src/renderable/codecs/reference-diagrams.ts
  • src/renderable/codecs/reference-types.ts
  • src/renderable/codecs/reference.ts
  • src/renderable/codecs/reporting.ts
  • src/renderable/codecs/requirements.ts
  • src/renderable/codecs/session.ts
  • src/renderable/codecs/taxonomy.ts
  • src/renderable/codecs/timeline.ts
  • src/renderable/codecs/types/base.ts
  • src/renderable/codecs/validation-rules.ts
  • src/renderable/generate.ts
  • src/renderable/index.ts
  • src/renderable/render-options.ts
  • src/renderable/render.ts
  • src/renderable/split.ts
  • src/scanner/ast-parser.ts
  • src/taxonomy/arch-values.ts
  • src/taxonomy/registry-builder.ts
  • src/validation-schemas/doc-directive.ts
  • src/validation-schemas/dual-source.ts
  • src/validation-schemas/extracted-pattern.ts
  • src/validation-schemas/master-dataset.ts
  • tests/features/behavior/pattern-tag-extraction.feature
  • tests/features/behavior/transform-dataset.feature
  • tests/features/config/config-resolution.feature
  • tests/features/doc-generation/index-codec.feature
  • tests/features/generators/codec-based.feature
  • tests/features/types/tag-registry-builder.feature
  • tests/fixtures/pattern-factories.ts
  • tests/fixtures/scanner-fixtures.ts
  • tests/steps/behavior/cli/process-api-reference.steps.ts
  • tests/steps/behavior/codecs/reporting-codecs.steps.ts
  • tests/steps/behavior/description-quality-foundation.steps.ts
  • tests/steps/behavior/pattern-tag-extraction.steps.ts
  • tests/steps/behavior/transform-dataset.steps.ts
  • tests/steps/config/config-resolution.steps.ts
  • tests/steps/doc-generation/index-codec.steps.ts
  • tests/steps/generators/codec-based.steps.ts
  • tests/support/helpers/assertions.ts
💤 Files with no reviewable changes (11)
  • src/extractor/index.ts
  • tests/features/types/tag-registry-builder.feature
  • src/generators/built-in/reference-generators.ts
  • src/extractor/dual-source-extractor.ts
  • src/generators/built-in/design-review-generator.ts
  • src/validation-schemas/dual-source.ts
  • tests/fixtures/scanner-fixtures.ts
  • tests/fixtures/pattern-factories.ts
  • tests/support/helpers/assertions.ts
  • src/validation-schemas/extracted-pattern.ts
  • src/validation-schemas/doc-directive.ts

- Remove CodecBasedGenerator masterDataset runtime guard (trust required type,
  consistent with 4 other generators and 6 guards already removed)
- Remove dead `brief` type field from gherkin-ast-parser return type
- Fix stale JSDoc in registry-builder (brief removed from core group)
- Add `removed-tag` anti-pattern rule (error severity) that detects
  tags removed from registry but still present in feature files
- Update bySource → bySourceType in CLAUDE.md source module and ARCHITECTURE.md
- Regenerate all docs-live/ output

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
src/config/project-config-schema.ts (1)

155-157: ⚠️ Potential issue | 🟠 Major

z.record with enum keys requires all keys in Zod 4 — use z.partialRecord for partial overrides.

This issue was previously identified. Per the Zod 4 library documentation: "z.record() now requires both key and value schemas. Supports enums as keys and enforces completeness. New z.partialRecord() added for optional enum keys."

The current schema will reject configs that override only a subset of format types (e.g., { value: { example: '...' } }), which contradicts the intended partial override behavior shown in the downstream consumer (src/renderable/codecs/taxonomy.ts).

Suggested fix
 const TagExampleOverridesSchema = z
-  .record(z.enum(FORMAT_TYPES), TagExampleOverrideSchema.optional())
+  .partialRecord(z.enum(FORMAT_TYPES), TagExampleOverrideSchema)
   .optional();
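
The difference between the two behaviours can be illustrated without Zod. The sketch below is a dependency-free stand-in, not Zod's implementation: the `FORMAT_TYPES` values other than `value` are hypothetical, and both checker functions are invented for this example.

```typescript
// Hypothetical enum values; only 'value' appears in the review comment.
const FORMAT_TYPES = ["value", "enum", "flag"] as const;
type Override = { example?: string; description?: string };

// Exhaustive check: every enum key must be present
// (the behaviour the comment attributes to z.record with enum keys).
function isExhaustiveRecord(input: Record<string, Override>): boolean {
  return FORMAT_TYPES.every((key) => key in input);
}

// Partial check: any subset of enum keys passes, unknown keys do not
// (the behaviour wanted from z.partialRecord).
function isPartialRecord(input: Record<string, Override>): boolean {
  const allowed: readonly string[] = FORMAT_TYPES;
  return Object.keys(input).every((key) => allowed.includes(key));
}

// A config that overrides a single format type.
const subset = { value: { example: "..." } };
console.log(isExhaustiveRecord(subset), isPartialRecord(subset));
```

Under the exhaustive check the single-key config is rejected, which is the failure mode the comment warns about for partial overrides.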
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/project-config-schema.ts` around lines 155 - 157, The
TagExampleOverridesSchema currently uses z.record(z.enum(FORMAT_TYPES),
TagExampleOverrideSchema.optional()) which in Zod 4 requires all enum keys;
change it to use z.partialRecord(z.enum(FORMAT_TYPES), TagExampleOverrideSchema)
(and keep the overall .optional() if desired) so that partial overrides (e.g.,
only some FORMAT_TYPES keys) are allowed; update the definition referencing
TagExampleOverridesSchema, FORMAT_TYPES, and TagExampleOverrideSchema
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 19533136-6da1-4b70-b8b3-f89e674c1a0d

📥 Commits

Reviewing files that changed from the base of the PR and between f8a2dad and e59f443.

📒 Files selected for processing (2)
  • src/config/project-config-schema.ts
  • src/generators/codec-based.ts


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/validation/anti-patterns.ts (1)

422-434: 🧹 Nitpick | 🔵 Trivial

Source mapping is correct for current scope, but consider future-proofing

The hardcoded source mapping ('process-in-code' → 'typescript', else 'gherkin') works correctly for the current detectRemovedTags implementation since it only scans feature files. However, if detectRemovedTags is extended to also scan TypeScript files (as suggested above), this mapping would need to be updated.

Consider making the source explicit in the violation itself or using a more extensible mapping approach to avoid future maintenance issues.

Optional: explicit source mapping
+const SOURCE_BY_ANTI_PATTERN: Record<AntiPatternId, 'typescript' | 'gherkin' | 'cross-source'> = {
+  'tag-duplication': 'cross-source',
+  'process-in-code': 'typescript',
+  'removed-tag': 'gherkin', // Update if TS scanning is added
+  'magic-comments': 'gherkin',
+  'scenario-bloat': 'gherkin',
+  'mega-feature': 'gherkin',
+};

 export function toValidationIssues(violations: readonly AntiPatternViolation[]): Array<{...}> {
   return violations.map((v) => ({
     severity: v.severity,
     message: `[${v.id}] ${v.message}`,
-    source: v.id === 'process-in-code' ? ('typescript' as const) : ('gherkin' as const),
+    source: SOURCE_BY_ANTI_PATTERN[v.id],
     file: v.file,
   }));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/validation/anti-patterns.ts` around lines 422 - 434, The current
toValidationIssues function hardcodes source based on v.id which will break if
detectRemovedTags or other detectors start producing violations from TypeScript;
modify toValidationIssues to prefer an explicit source property on the
AntiPatternViolation (e.g., v.source) if present and fall back to the existing
id-based mapping (keep v.id === 'process-in-code' → 'typescript', else
'gherkin') so future detectors can set v.source directly; update the
AntiPatternViolation type (or its generator) to allow an optional source:
'typescript' | 'gherkin' | 'cross-source' and adjust detectRemovedTags (and any
other callers) to populate that field when appropriate.
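
A sketch of the explicit-source variant described above, assuming a trimmed-down violation type. `Violation`, `IssueSource`, and `resolveSource` are hypothetical names; the real `AntiPatternViolation` has its own fields and id union.

```typescript
type IssueSource = "typescript" | "gherkin" | "cross-source";

// Hypothetical, reduced version of the violation type with the optional field added.
interface Violation {
  id: string;
  message: string;
  file: string;
  severity: "error" | "warning";
  source?: IssueSource;
}

// Prefer the detector-supplied source; otherwise keep the existing id-based mapping.
function resolveSource(v: Violation): IssueSource {
  return v.source ?? (v.id === "process-in-code" ? "typescript" : "gherkin");
}

function toValidationIssues(violations: readonly Violation[]) {
  return violations.map((v) => ({
    severity: v.severity,
    message: `[${v.id}] ${v.message}`,
    source: resolveSource(v),
    file: v.file,
  }));
}
```

With this shape, a detector that later scans TypeScript files can set `source` itself and the mapping function does not need to change.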
src/taxonomy/registry-builder.ts (1)

110-123: ⚠️ Potential issue | 🟡 Minor

Update hardcoded filter in taxonomy.ts to match registry changes

The removal of 'core' from METADATA_TAGS_BY_GROUP.core in registry-builder.ts creates a mismatch with the hardcoded filter in src/renderable/codecs/taxonomy.ts:351:

if (['pattern', 'status', 'core', 'usecase'].includes(tag.tag)) {

This filter still references 'core', which no longer exists in the tag registry. Update the filter to ['pattern', 'status', 'usecase'], or better yet, reference METADATA_TAGS_BY_GROUP.core directly to prevent future drift.
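
A registry-driven version of that check might look like the sketch below. The constant here is a local stand-in for the one exported by `registry-builder.ts`, with contents taken from the suggested list; `isCoreTag` is a hypothetical helper name.

```typescript
// Stand-in for the exported constant; the real group contents may differ.
const METADATA_TAGS_BY_GROUP = {
  core: ["pattern", "status", "usecase"],
} as const;

// Build the lookup once from the canonical list instead of repeating a literal array.
const CORE_TAGS = new Set<string>(METADATA_TAGS_BY_GROUP.core);

function isCoreTag(tag: { tag: string }): boolean {
  return CORE_TAGS.has(tag.tag);
}
```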

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/taxonomy/registry-builder.ts` around lines 110 - 123, The hardcoded tag
filter in src/renderable/codecs/taxonomy.ts (the if check that currently does if
(['pattern','status','core','usecase'].includes(tag.tag))) is out of sync with
METADATA_TAGS_BY_GROUP.core in registry-builder.ts which no longer contains
'core'; update the filter to use the canonical source instead of hardcoding:
import or reference METADATA_TAGS_BY_GROUP and replace the array literal with
METADATA_TAGS_BY_GROUP.core (or explicitly with ['pattern','status','usecase']
if import not possible) so the check uses the registry-driven list and prevents
future drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/validation/anti-patterns.ts`:
- Around line 149-152: detectRemovedTags currently only inspects
ScannedGherkinFile[] and thus misses tags present in TypeScript JSDoc
(ScannedFile) that were removed from feature files; extend the check to scan
ScannedFile[] (or add an overload) and parse JSDoc comments for tags registered
in TagRegistry, emitting AntiPatternViolation for tags found in TS files but
absent from features. Use the existing detectRemovedTags function as the entry
point, reuse TagRegistry lookups to identify relevant tags, mirror the
scanning/parsing approach used by detectProcessInCode for locating tags in
TypeScript/scanned files, and return violations in the same shape so callers of
detectRemovedTags continue to work.
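
As a rough illustration of the TypeScript side of that check, the sketch below scans JSDoc blocks for tags that are no longer registered. Everything in it is an assumption: `SourceFile`, `RemovedTagHit`, and `findRemovedTagsInJsDoc` are hypothetical, the tag prefix is passed in rather than read from `TagRegistry`, and the real scanner works on parsed files rather than raw strings.

```typescript
// Hypothetical, simplified inputs and outputs.
interface SourceFile {
  path: string;
  content: string;
}
interface RemovedTagHit {
  file: string;
  tag: string;
}

// Escape regex metacharacters so a tag name is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Report removed tags that still appear inside /** ... */ blocks.
function findRemovedTagsInJsDoc(
  files: readonly SourceFile[],
  tagPrefix: string,
  removedTags: readonly string[]
): RemovedTagHit[] {
  const hits: RemovedTagHit[] = [];
  for (const file of files) {
    const jsDocBlocks = file.content.match(/\/\*\*[\s\S]*?\*\//g) ?? [];
    for (const tag of removedTags) {
      // The lookahead rejects longer tag names that merely start with the removed one.
      const pattern = new RegExp(escapeRegExp(tagPrefix + tag) + "(?![\\w-])");
      if (jsDocBlocks.some((block) => pattern.test(block))) {
        hits.push({ file: file.path, tag });
      }
    }
  }
  return hits;
}
```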

---

Outside diff comments:
In `@src/taxonomy/registry-builder.ts`:
- Around line 110-123: The hardcoded tag filter in
src/renderable/codecs/taxonomy.ts (the if check that currently does if
(['pattern','status','core','usecase'].includes(tag.tag))) is out of sync with
METADATA_TAGS_BY_GROUP.core in registry-builder.ts which no longer contains
'core'; update the filter to use the canonical source instead of hardcoding:
import or reference METADATA_TAGS_BY_GROUP and replace the array literal with
METADATA_TAGS_BY_GROUP.core (or explicitly with ['pattern','status','usecase']
if import not possible) so the check uses the registry-driven list and prevents
future drift.

In `@src/validation/anti-patterns.ts`:
- Around line 422-434: The current toValidationIssues function hardcodes source
based on v.id which will break if detectRemovedTags or other detectors start
producing violations from TypeScript; modify toValidationIssues to prefer an
explicit source property on the AntiPatternViolation (e.g., v.source) if present
and fall back to the existing id-based mapping (keep v.id === 'process-in-code'
→ 'typescript', else 'gherkin') so future detectors can set v.source directly;
update the AntiPatternViolation type (or its generator) to allow an optional
source: 'typescript' | 'gherkin' | 'cross-source' and adjust detectRemovedTags
(and any other callers) to populate that field when appropriate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b9c57520-aa05-48d6-8cac-07e43f39dce4

📥 Commits

Reviewing files that changed from the base of the PR and between e59f443 and 8c3772d.

⛔ Files ignored due to path filters (25)
  • _claude-md/testing/test-implementation.md is excluded by none and included by none
  • docs-live/ARCHITECTURE.md is excluded by none and included by none
  • docs-live/BUSINESS-RULES.md is excluded by none and included by none
  • docs-live/CHANGELOG-GENERATED.md is excluded by none and included by none
  • docs-live/INDEX.md is excluded by none and included by none
  • docs-live/PRODUCT-AREAS.md is excluded by none and included by none
  • docs-live/TAXONOMY.md is excluded by none and included by none
  • docs-live/_claude-md/annotation/annotation-overview.md is excluded by none and included by none
  • docs-live/_claude-md/architecture/reference-sample.md is excluded by none and included by none
  • docs-live/_claude-md/core-types/core-types-overview.md is excluded by none and included by none
  • docs-live/_claude-md/validation/validation-overview.md is excluded by none and included by none
  • docs-live/business-rules/annotation.md is excluded by none and included by none
  • docs-live/business-rules/configuration.md is excluded by none and included by none
  • docs-live/business-rules/data-api.md is excluded by none and included by none
  • docs-live/business-rules/generation.md is excluded by none and included by none
  • docs-live/product-areas/ANNOTATION.md is excluded by none and included by none
  • docs-live/product-areas/CONFIGURATION.md is excluded by none and included by none
  • docs-live/product-areas/CORE-TYPES.md is excluded by none and included by none
  • docs-live/product-areas/DATA-API.md is excluded by none and included by none
  • docs-live/product-areas/GENERATION.md is excluded by none and included by none
  • docs-live/product-areas/VALIDATION.md is excluded by none and included by none
  • docs-live/reference/ARCHITECTURE-TYPES.md is excluded by none and included by none
  • docs-live/reference/REFERENCE-SAMPLE.md is excluded by none and included by none
  • docs-live/taxonomy/format-types.md is excluded by none and included by none
  • docs-live/taxonomy/metadata-tags.md is excluded by none and included by none
📒 Files selected for processing (6)
  • docs/ARCHITECTURE.md
  • src/generators/codec-based.ts
  • src/scanner/gherkin-ast-parser.ts
  • src/taxonomy/registry-builder.ts
  • src/validation/anti-patterns.ts
  • src/validation/types.ts
💤 Files with no reviewable changes (1)
  • src/scanner/gherkin-ast-parser.ts

…, test precision

- fullDocsPath default 'docs/' → 'docs-live/' to match output directory
- PRODUCT_AREA_ARCH_CONTEXT_MAP and PRODUCT_AREA_META now use ProductAreaKey
  type instead of Record<string, ...> for compile-time key safety
- Move misplaced imports to top of file in session.ts and timeline.ts
- Test helpers use exact heading match (===) instead of includes()
- Progress bar test now asserts █ glyph presence, not just %
- Preamble position test verifies paragraph index between headings
- Extract createMockGeneratorContext() helper (4x duplication removed)
@darko-mijic darko-mijic merged commit 0bf7f92 into main Apr 1, 2026
4 checks passed
@darko-mijic darko-mijic deleted the refactor/progressive-disclosure-codec-pipeline branch April 1, 2026 14:46