test: refresh product os reader quality reference #342

Merged
Stahl-G merged 1 commit into main from codex/v0113-reference-run-refresh on Jul 2, 2026
@Stahl-G (Owner) commented Jul 2, 2026

Summary

  • extend the packaged same_evidence_reader_quality_regression case to generate clean delivery/audit bundle archives through a new eval-only packs.bundle action
  • add source appendix and source appendix trace fixture artifacts plus finalize audit/delivery hash bindings
  • add a public-safe v0.11.3 Product OS reader-quality reference note and link it from README/docs indexes
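
The idea of reproducible bundle archives with hash bindings can be illustrated with a minimal sketch. It is not the repository's packs.bundle implementation: the function names build_archive and bind_hashes, the fixed timestamp, and the in-memory file layout are all assumptions made for illustration only.

```python
import hashlib
import io
import zipfile

# Hypothetical fixed timestamp so archive bytes do not depend on the wall clock.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def build_archive(files: dict[str, bytes]) -> bytes:
    """Build a zip archive whose bytes depend only on member names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        # Sort member names so insertion order does not change the output.
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            archive.writestr(info, files[name])
    return buffer.getvalue()


def bind_hashes(archives: dict[str, bytes]) -> dict[str, str]:
    """Map each archive label to the SHA-256 hex digest of its bytes."""
    return {
        label: hashlib.sha256(data).hexdigest()
        for label, data in archives.items()
    }
```

Because the member order and timestamps are fixed, rebuilding from the same synthetic fixture yields the same digest, which is the property a deterministic regression case can assert on.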

Boundary

  • reference package / deterministic regression only
  • no new runtime warning surface, projection artifact, gate authority, UI, semantic proof, output-quality score, delivery approval, or release authorization
  • fixture remains synthetic public-safe data

Validation

  • python3 -m pytest -q tests/test_evaluation_cases.py tests/test_report_bundles.py
  • PYTHONPATH=src python3 -m multi_agent_brief.cli.main eval-cases validate --json
  • PYTHONPATH=src python3 -m multi_agent_brief.cli.main eval-cases run --case-id same_evidence_reader_quality_regression --repo-workdir . --json
  • python3 scripts/check_public_safety.py --path docs/reference-runs/v0.11.3-product-os-reader-quality-reference.md
  • python3 scripts/check_public_safety.py --path src/multi_agent_brief/evaluation_cases/fixtures/cases/same_evidence_reader_quality_regression
  • python3 scripts/check_product_baseline.py
  • python3 scripts/check_release_consistency.py --no-tag
  • python3 scripts/check_briefloop_skill_freshness.py
  • python3 scripts/check_skill_contract.py
  • python3 scripts/check_version_consistency.py
  • PYTHONPATH=src python3 scripts/check_capabilities.py
  • python3 scripts/check_runtime_asset_parity.py
  • python3 scripts/generate_agent_configs.py --check
  • python3 scripts/sync_hermes_plugin_skills.py --check
  • git diff --check
  • python3 -m compileall -q src tests
  • non-editable install smoke: eval-cases validate + same_evidence_reader_quality_regression
  • python3 -m pytest -q
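
For re-running a subset of the checks above locally, a small runner along the following lines may help. This is a sketch only: the run_checks helper and the CHECKS structure are hypothetical and not part of the repository; the commands themselves are copied from the list above and assume the repository root as the working directory.

```python
import os
import subprocess

# Each entry is (extra environment variables, argv). The commands are copied
# from the validation list; the (env, argv) pairing is an illustrative choice.
CHECKS = [
    ({}, ["python3", "-m", "pytest", "-q",
          "tests/test_evaluation_cases.py", "tests/test_report_bundles.py"]),
    ({"PYTHONPATH": "src"},
     ["python3", "-m", "multi_agent_brief.cli.main",
      "eval-cases", "validate", "--json"]),
    ({}, ["python3", "scripts/check_product_baseline.py"]),
    ({}, ["git", "diff", "--check"]),
]


def run_checks(checks):
    """Run checks in order and stop at the first non-zero exit code.

    Returns a list of (argv, exit_code) pairs for the commands that ran.
    """
    results = []
    for extra_env, argv in checks:
        # Layer the per-command variables over the current environment.
        env = {**os.environ, **extra_env}
        code = subprocess.run(argv, env=env, check=False).returncode
        results.append((argv, code))
        if code != 0:
            break
    return results
```

Calling run_checks(CHECKS) from the repository root returns the exit code of each command that ran; a final non-zero code identifies the first failing check.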

@Stahl-G force-pushed the codex/v0113-reference-run-refresh branch from 4ce77e0 to 582ff99 on July 2, 2026 05:11
@Stahl-G marked this pull request as ready for review on July 2, 2026 05:19
@Stahl-G merged commit 3c8a3e0 into main on Jul 2, 2026
13 checks passed
@Stahl-G deleted the codex/v0113-reference-run-refresh branch on July 2, 2026 05:19