Merged
1 change: 1 addition & 0 deletions .specallowlist
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# Documentation
*.md
context/technical-notes/*.md
LICENSE

# Python package
3 changes: 2 additions & 1 deletion .specignore
@@ -16,4 +16,5 @@ specs/architecture/solutions/SOL-001.md
# Implementation Design examples
specs/design/IMP-001.md

# Note: TN-001 moved to docs/analysis/ as it's an analysis document rather than a specification
# Technical notes are validated but don't require test coverage
context/technical-notes/*.md
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -13,7 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Implemented Issue #42: improved reference resolution and flexible file classification (typed / unmanaged / external / excluded), added `.specignore` support, and CLI flags for strict validation and external URL checking (#44)

### Docs
- Investigation and solution notes for Issue #42 added to `technical-notes/issue-42-investigation.md` (#43)
- Investigation and solution notes for Issue #42 added to `context/technical-notes/TN-004.md` (#43)

### Changed
- Reference resolution flow updated to support the new classification system and fallbacks for unmanaged files; added tests and integration scenarios for mixed typed/unmanaged content (#44)
File renamed without changes.
@@ -1,11 +1,12 @@
# Technical Note: LLM Provider Selection for Semantic Test Adherence
# TN-002: LLM Provider Selection for Semantic Test Adherence

**Type**: Technical Note
**Date**: 2025-10-23
**Author**: Claude (AI Assistant)
**Status**: Draft
**Related**: Milestone 001 - Semantic Test Adherence (SPEC-003)

## Executive Summary
## Abstract

This technical note evaluates LLM provider options and integration libraries for implementing the semantic test adherence checker (SPEC-003). The analysis focuses on:

@@ -20,9 +21,15 @@ This technical note evaluates LLM provider options and integration libraries for

This approach provides maximum flexibility, built-in retry/fallback logic, and a unified API surface while supporting both cost-effective CI/CD and enterprise deployment scenarios.

## Requirements Context
## Background

From SPEC-003, the semantic test adherence checker must:
### Context

The semantic test adherence checker (SPEC-003) requires LLM capabilities to analyze requirement-test pairs and verify semantic alignment. The system must operate effectively in multiple contexts: cost-free GitHub Actions CI/CD, local development, and production enterprise deployments.

### Requirements from SPEC-003

The semantic test adherence checker must satisfy the following requirements:

- **REQ-021**: Support multiple AI/LLM backends
- **REQ-022**: Support Anthropic Claude (via API), OpenAI GPT, and local models (Ollama)
@@ -32,6 +39,8 @@ From SPEC-003, the semantic test adherence checker must:
- **REQ-041**: Support concurrent LLM requests (default: 5)
- **REQ-042**: Cache results to avoid re-analyzing unchanged code

### Key Constraints

The tool will analyze requirement-test pairs to verify semantic alignment, requiring:
- Robust error handling for API failures
- Cost-effective operation in CI/CD (GitHub Actions)
@@ -1,8 +1,37 @@
# Type Definition Schema Design
# TN-003: Type Definition Schema Design

**Type**: Technical Note
**Date**: 2025-10-25
**Status**: Implementation
**Related Spec**: SPEC-004
**Author**: Claude
**Status**: Published
**Related**: SPEC-004

## Abstract

This document defines the YAML schema for DSL type definitions, incorporating ID-based reference resolution where links are treated as typed references to module and class instances. The schema provides a declarative approach to specifying document structure, validation rules, and cross-document relationships for markdown-based specifications.

**Key concepts**: Module and class type definitions, ID-based linking, reference semantics, content validators, and validation strategies.

## Background

### Context

The spec-check project implements a DSL (Domain-Specific Language) for defining and validating structured markdown specifications. To support this, we need a schema format that can express:
- Document structure requirements (sections, headings, content types)
- Identifier patterns and uniqueness constraints
- Cross-document reference types and cardinality rules
- Content validation rules (Gherkin, EARS, custom formats)

### Motivation

YAML was chosen as the schema definition format because it:
- Is human-readable and writable
- Supports complex nested structures
- Has wide tool support
- Can be version-controlled effectively
- Allows comments for documentation

The ID-based reference system enables location-independent linking, where documents can reference each other by stable identifiers rather than file paths, providing better refactoring support and semantic clarity.
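To illustrate why ID-based linking survives refactoring, here is a minimal sketch of a registry that maps stable identifiers to file paths. `IdRegistry` and its methods are hypothetical names chosen for this example (the actual registry in spec-check is not shown in this note), and the "old" path is invented for illustration.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, Optional


@dataclass
class IdRegistry:
    """Hypothetical registry mapping stable document IDs to file paths."""

    _paths: Dict[str, PurePosixPath] = field(default_factory=dict)

    def register(self, doc_id: str, path: str) -> None:
        # Identifiers must be unique, otherwise a reference would be ambiguous.
        if doc_id in self._paths:
            raise ValueError(f"duplicate identifier: {doc_id}")
        self._paths[doc_id] = PurePosixPath(path)

    def move(self, doc_id: str, new_path: str) -> None:
        # Moving a file only updates the registry entry; references that use
        # the ID do not need to be rewritten.
        self._paths[doc_id] = PurePosixPath(new_path)

    def resolve(self, doc_id: str) -> Optional[PurePosixPath]:
        return self._paths.get(doc_id)
```

A document that links to `TN-003` keeps resolving after the file is moved, because only the registry entry changes.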

## Overview

@@ -703,3 +732,34 @@ The system should provide a migration tool:
spec-check migrate-to-dsl --analyze # Report what needs changing
spec-check migrate-to-dsl --convert # Auto-convert where possible
```

## Conclusion

This type definition schema design provides a comprehensive foundation for expressing specification document structures in YAML format. The schema supports:

**Core Capabilities**:
- **Module definitions** for document-level types (Requirements, Contracts, ADRs, etc.)
- **Class definitions** for section-level patterns (Test Cases, Acceptance Criteria, etc.)
- **ID-based reference system** enabling location-independent linking
- **Content validators** for domain-specific syntax (Gherkin, EARS, etc.)
- **Cardinality constraints** for relationship validation
- **Flexible scope rules** (global, directory, module, section-scoped IDs)

**Key Design Decisions**:
1. **Declarative YAML format** for easy authoring and version control
2. **Separation of modules and classes** for reusability
3. **ID-based linking** rather than file paths for stability
4. **Pluggable content validators** for extensibility
5. **Multi-strategy resolution** supporting different project organizations

**Implementation Status**:
The Python implementation in `spec_check/dsl/models.py` provides Pydantic-based models corresponding to this schema design, enabling type-safe schema definitions with excellent IDE support and validation.

**Next Steps**:
- YAML schema loader implementation
- Migration tooling for existing specifications
- Additional content validators (table structure, RFC 2119 keywords)
- Schema visualization tools
- Template generation from schemas

This schema design balances expressiveness with simplicity, enabling rich specification structures while keeping type definitions readable and maintainable.
@@ -1,8 +1,28 @@
# Issue #42 Investigation: validate-dsl Reference Resolution Failures
# TN-004: Issue #42 Investigation - validate-dsl Reference Resolution Failures

**Issue**: validate-dsl fails to resolve references between validated documents
**Type**: Technical Note
**Date**: 2025-10-30
**Status**: Root cause identified, solution proposed
**Author**: Claude
**Status**: Published
**Related**: Issue #42

## Abstract

This note investigates `validate-dsl` command failures when resolving cross-directory file path references. The issue manifested as ~548 reference resolution failures in a typical project: the command could not resolve references between validated documents even when both source and target files were present.

**Root Cause**: Files are only resolvable if registered in the ID registry, and registration only occurs when files match a defined type definition. Documents that don't match any type are never registered and cannot be referenced.

**Solution**: Implement support for "unmanaged" documents (files without type definitions) that can still be referenced by file path, with clear distinction between typed and untyped document references.

## Background

### Context

The `validate-dsl` command performs cross-document reference validation as part of the spec-check validation pipeline. It was designed to validate references between strongly-typed specification documents (Requirements, ADRs, Jobs, etc.) using ID-based linking.

### Problem Discovery

Users reported that `validate-dsl` failed to resolve references in real-world repositories containing mixed content: typed specifications alongside general documentation, design notes, roadmaps, and other markdown files. The validator would fail with "Module reference not found" errors for perfectly valid file path references.

## Problem Statement

@@ -993,3 +1013,33 @@ References validated: 35
- PR #34: Auto-ignore VCS directories in linter
- PR #37: Support file path references in validate-dsl
- `spec_check/linter.py:221`: Existing VCS directory exclusion pattern

## Conclusion

**Root Cause Identified**: The validate-dsl reference resolution failures stem from an architectural assumption that all validated documents must match a type definition. Files without matching types are never registered in the ID registry and cannot be resolved as reference targets.

**Recommended Solution**:
1. **Support unmanaged documents** - Files that don't match type definitions can still be validated and referenced by file path
2. **Two-tier resolution strategy**:
   - Typed documents: Use ID-based references with full type checking
   - Unmanaged documents: Use file path references with existence checking only
3. **Auto-exclusion of VCS and tooling directories** - Automatically ignore `.git/`, `.claude/`, `.venv/`, etc. to reduce noise
4. **Clear reporting** - Distinguish between typed and unmanaged documents in validation output
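The two-tier strategy in the list above can be sketched as a single classification function. This is an illustration under stated assumptions, not the resolver in spec-check: `classify_reference` is a hypothetical name, the set of unmanaged files is passed in rather than read from disk, and the three return labels are invented for the example.

```python
from pathlib import PurePosixPath
from typing import Dict, Set


def classify_reference(target: str,
                       typed_docs: Dict[str, PurePosixPath],
                       unmanaged_files: Set[PurePosixPath]) -> str:
    """Return 'typed', 'unmanaged', or 'unresolved' for a reference target."""
    # Tier 1: the target is a registered ID (full type checking would apply).
    if target in typed_docs:
        return "typed"
    path = PurePosixPath(target)
    # A file path that points at a typed document is still a typed reference.
    if path in typed_docs.values():
        return "typed"
    # Tier 2: the file matches no type definition; only existence is checked.
    if path in unmanaged_files:
        return "unmanaged"
    return "unresolved"
```

Before the fix described in this note, the second tier did not exist, so a reference such as `docs/roadmap.md` (a hypothetical unmanaged file) would have ended in the "unresolved" branch.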

**Implementation Priority**: High - This blocking issue prevents validate-dsl from working in real-world repositories with mixed content.

**Design Principles**:
- **Low friction**: Repositories with general docs should validate without additional configuration
- **Gradual typing**: Teams can incrementally add type definitions without breaking existing references
- **Clear semantics**: Typed vs untyped references have different validation rules and capabilities
- **Backward compatibility**: Existing fully-typed repositories continue to work identically

**Next Steps**:
1. Implement `UnmanagedDocumentType` class for files without type definitions
2. Add VCS directory auto-exclusion (reuse linter patterns)
3. Update reference resolver to handle both typed and file-path references
4. Enhance validation reporting with type breakdowns
5. Add `--strict` mode for teams that want explicit classification

This investigation demonstrates the importance of designing validation tools that accommodate real-world usage patterns rather than idealized scenarios. The solution balances strong typing for specifications with pragmatic handling of mixed-content repositories.

@@ -1,53 +1,41 @@
# Specification DSL Design Document
# TN-005: Specification DSL Design

**Type**: Technical Note
**Date**: 2025-10-24
**Status**: Proposal
**Related Spec**: SPEC-004
**Author**: Claude
**Status**: Published
**Related**: SPEC-004

## Overview
## Abstract

This document describes the design of a domain-specific language (DSL) for defining and validating structured markdown specifications. The system provides a schema layer for markdown documents, enabling machine-enforceable structure while preserving the human-readability and standard tooling compatibility of markdown.

## Problem Statement

Organizations maintain technical specifications, requirements, contracts, and architectural documents in markdown format. These documents follow implicit conventions around structure, naming, cross-references, and content patterns. Currently, these conventions exist only in documentation and tribal knowledge, leading to inconsistency and manual review overhead.

The goal is to formalize these conventions as explicit, machine-validated type definitions while keeping the actual specification documents as standard markdown that renders correctly in any markdown viewer.
**Key Innovation**: A two-layer architecture where YAML type definitions describe document schemas, and standard markdown files conform to these schemas while remaining fully compatible with existing markdown parsers and viewers.

## Core Concept
**Core Concepts**: Modules (document-level types), Classes (section-level patterns), References (typed relationships), Identifiers (stable unique designators), and Content Validators (domain-specific syntax checkers).

The system operates as a two-layer architecture:
## Background

**Type Definition Layer** - YAML documents that define schemas for specification document types. These schemas describe file naming patterns, section structure, required fields, identifier patterns, and allowed cross-references.
### Problem Statement

**Document Layer** - Standard markdown files that conform to the defined types. These remain fully compatible with standard markdown parsers and viewers, but can be validated against their type definitions for correctness.
Organizations maintain technical specifications, requirements, contracts, and architectural documents in markdown format. These documents follow implicit conventions around structure, naming, cross-references, and content patterns. Currently, these conventions exist only in documentation and tribal knowledge, leading to:
- Inconsistency across specifications
- Manual review overhead
- Broken references that go undetected
- Difficulty enforcing organizational standards

This separation ensures that specifications remain human-readable first-class markdown while gaining the benefits of machine-enforced structure.
### Design Goal

## Fundamental Abstractions
Formalize these conventions as explicit, machine-validated type definitions while keeping the actual specification documents as standard markdown that renders correctly in any markdown viewer. This separation ensures that specifications remain human-readable first-class markdown while gaining the benefits of machine-enforced structure.

### Modules
### System Overview

A module is a file-level type definition. It describes an entire specification document class such as a requirement, contract, or architecture decision record. Modules define:

- File naming patterns and allowed locations in the repository
- The relationship between filename conventions and document titles
- Identifier patterns and where they appear in the document
- Required and optional section structure
- Allowed cross-references to other module types
- Cardinality constraints for relationships

**Example Module Use Cases:**
- A "Requirement" module that enforces EARS format with required sections
- A "Contract" module that validates legal document structure
- An "ADR" module that ensures architectural decisions follow a template
The system operates as a two-layer architecture:

### Classes
**Type Definition Layer** - YAML documents that define schemas for specification document types. These schemas describe file naming patterns, section structure, required fields, identifier patterns, and allowed cross-references.

A class is a section-level type definition. It describes a repeatable structural pattern within a document, such as an acceptance criterion, a contract clause, or a risk assessment. Classes define:
**Document Layer** - Standard markdown files that conform to the defined types. These remain fully compatible with standard markdown parsers and viewers, but can be validated against their type definitions for correctness.
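As a rough sketch of how the two layers interact, the example below checks a plain markdown document against a type definition. The definition is written as a Python dict to keep the example dependency-free; in the design it would be loaded from YAML. `TYPE_DEFINITION`, its keys, the choice of required sections, and `missing_sections` are all hypothetical and are not taken from the actual schema.

```python
import re
from typing import Dict, List

# Type Definition Layer (hypothetical shape; the real schema lives in YAML).
TYPE_DEFINITION: Dict = {
    "name": "TechnicalNote",
    "required_sections": ["Abstract", "Background"],
}


def missing_sections(markdown: str, type_def: Dict) -> List[str]:
    """List required level-2 sections that the document does not contain."""
    # Match only '## Heading' lines; '### Sub-heading' lines are not matched
    # because the third '#' is not whitespace.
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^##\s+(.+)$", markdown, flags=re.MULTILINE)
    }
    return [name for name in type_def["required_sections"] if name not in headings]
```

The document itself stays ordinary markdown; validation is a separate pass that reports, for instance, that a note with an Abstract but no Background section is missing `Background`.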

- Heading patterns and nesting levels
- Required fields within the section
- Content validation rules
- Nested sub-structures

3 changes: 3 additions & 0 deletions spec_check/dsl/builtin_types.py
@@ -19,6 +19,7 @@
PrinciplesModule,
RequirementModule,
SpecificationModule,
TechnicalNoteModule,
)

# ============================================================================
@@ -32,6 +33,7 @@
"ADR": ArchitectureDecisionModule(),
"Specification": SpecificationModule(),
"Principles": PrinciplesModule(),
"TechnicalNote": TechnicalNoteModule(),
}

# Export built-in class types
@@ -47,5 +49,6 @@
"ArchitectureDecisionModule",
"SpecificationModule",
"PrinciplesModule",
"TechnicalNoteModule",
"AcceptanceCriterion",
]
4 changes: 2 additions & 2 deletions spec_check/dsl/layers.py
@@ -586,7 +586,7 @@ class TechnicalNoteModule(SpecModule):
are not themselves requirements or implementations.

Example filename: TN-001.md
Location: specs/notes/
Location: context/technical-notes/

Required sections:
- Abstract: Summary of the note
@@ -602,7 +602,7 @@ class TechnicalNoteModule(SpecModule):
description: str = "Technical note or analysis document"

file_pattern: str = r"^TN-\d{3}\.md$"
location_pattern: str = r"specs/notes/"
location_pattern: str = r"context/technical-notes/"

identifier: IdentifierSpec = IdentifierSpec(
pattern=r"TN-\d{3}",
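For readers of this hunk, the sketch below shows one plausible way `file_pattern` and `location_pattern` could combine when deciding whether a path belongs to the Technical Note type. It is an assumption for illustration only: the real `matches_file` method on `SpecModule` is not shown in this diff and may behave differently (for example, in how it anchors the location pattern).

```python
import re
from pathlib import PurePosixPath

# Patterns copied from the new side of the hunk above.
FILE_PATTERN = r"^TN-\d{3}\.md$"
LOCATION_PATTERN = r"context/technical-notes/"


def matches_file(path: str) -> bool:
    """Hypothetical matcher: filename must match and location must appear."""
    p = PurePosixPath(path)
    name_ok = re.match(FILE_PATTERN, p.name) is not None
    # re.search, not re.match: the location may sit below a repository root.
    location_ok = re.search(LOCATION_PATTERN, p.as_posix()) is not None
    return name_ok and location_ok
```

Under this reading, `context/technical-notes/TN-001.md` matches, while the same filename under the old `specs/notes/` location does not, which is why the tests in `tests/test_new_document_types.py` are updated in the same change.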
2 changes: 1 addition & 1 deletion specs/future/specification-dsl.md
@@ -291,4 +291,4 @@ This feature will be considered successful when:

## Related Documents

See `technical-notes/specification-dsl-design.md` for detailed design discussion and implementation considerations.
See `context/technical-notes/TN-005.md` for detailed design discussion and implementation considerations.
8 changes: 4 additions & 4 deletions tests/test_new_document_types.py
@@ -330,9 +330,9 @@ def test_technical_note_file_pattern(self):
def test_technical_note_location_pattern(self):
"""Test that Technical Note module matches correct location."""
module = TechnicalNoteModule()
assert module.location_pattern == r"specs/notes/"
assert module.location_pattern == r"context/technical-notes/"

test_path = Path("specs/notes/TN-001.md")
test_path = Path("context/technical-notes/TN-001.md")
assert module.matches_file(test_path)

def test_technical_note_identifier_spec(self):
Expand Down Expand Up @@ -426,7 +426,7 @@ def test_implementation_design_document_validates(self, tmp_path):

def test_technical_note_document_validates(self, tmp_path):
"""Test that TN-001.md validates against Technical Note schema."""
tn_001 = Path("specs/notes/TN-001.md")
tn_001 = Path("context/technical-notes/TN-001.md")
if not tn_001.exists():
pytest.skip("TN-001.md not found")

@@ -450,7 +450,7 @@ def test_technical_note_document_validates(self, tmp_path):
"SOL-001",
),
("ImplementationDesign", r"^IMP-\d{3}\.md$", "specs/design/", "IMP-001"),
("TechnicalNote", r"^TN-\d{3}\.md$", "specs/notes/", "TN-001"),
("TechnicalNote", r"^TN-\d{3}\.md$", "context/technical-notes/", "TN-001"),
],
)
class TestDocumentTypePatterns: