feat: durability hardening — externalize model config, add eval net, single-source the system prompt by Cartooli · Pull Request #2 · Cartooli/edustack

Cartooli · 2026-06-05T14:26:29Z

Summary

Addresses the three feasible-to-fix issues from the durability review (score 30/56, HIGH RISK), targeting ~40/56 MODERATE — without touching the deterministic safety pipeline or stateless design that already scored well.

The shape was "airtight execution layer, no instruments, hardwired engine." This PR builds the instruments and unbolts the engine:

- Externalize model config (feat(config)): model, temperature, and max_tokens were hardcoded in the ai-tutor.sh business logic. They are now read from the already-loaded config/teacher-settings.json, with validated ranges and safe defaults equal to the historical values. A missing or malformed config falls back cleanly. Ships config/teacher-settings.example.json. Fixes the swap test: changing the model is now a single config value.
- Eval/regression harness (feat(tests)): there were no tests. Added tests/ covering every injection pattern, every blocklist category, PII, leet-speak evasion, and clean cases for content-filter.sh (39 assertions) and input-sanitizer.sh (6 assertions), wired into CI. Also adds an API-key-gated tutor golden-set that runs live locally and is skipped (not failed) in CI. Fixes the silence test: a weakened filter now breaks CI, and a bad model swap fails the golden-set when it is run with a key.
- Single-source the system prompt (feat(safety)): the prompt was triplicated across ai-tutor.sh, CLAUDE.md, and PROMPT-BUILDER.md and had already drifted (different banned-category lists). Moved to skills/safety/system-prompt.txt, loaded fail-closed (missing/empty → no API call), with a new safety-check.sh check 9 that verifies coverage of every CLAUDE.md banned category. Fixes fragment coherence.
- Cleanup (chore(safety)): removed the dead EVASION_FILE variable; documented that evasion normalization is intentionally in-code and now test-covered.
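
The config externalization above can be sketched as follows. This is a minimal illustration, not the code in ai-tutor.sh: the helper names (`read_setting`, `validated_int`, `validated_float`, `validated_model`), the ranges, and the defaults (including `example-model`) are all assumptions. Only the key names and the config path come from this PR.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, ranges, and defaults are assumptions,
# not the values used in ai-tutor.sh.

# Print one top-level key from a JSON file; print nothing on any error.
read_setting() {
  local file="$1" key="$2"
  python3 - "$file" "$key" <<'PY' 2>/dev/null || true
import json, sys
with open(sys.argv[1]) as fh:
    print(json.load(fh).get(sys.argv[2], ""))
PY
}

# Echo $1 if it is an integer in [$2, $3]; otherwise echo the default $4.
validated_int() {
  local value="$1" min="$2" max="$3" default="$4"
  if [[ "$value" =~ ^[0-9]{1,9}$ ]] && (( 10#$value >= min && 10#$value <= max )); then
    printf '%s\n' "$((10#$value))"
  else
    printf '%s\n' "$default"
  fi
}

# Echo $1 if it is a plain decimal in [$2, $3]; otherwise echo the default $4.
validated_float() {
  local value="$1" min="$2" max="$3" default="$4"
  if [[ "$value" =~ ^[0-9]+(\.[0-9]+)?$ ]] &&
     awk -v v="$value" -v lo="$min" -v hi="$max" \
       'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'; then
    printf '%s\n' "$value"
  else
    printf '%s\n' "$default"
  fi
}

# Echo $1 if it looks like a model identifier; otherwise echo the default $2.
validated_model() {
  local value="$1" default="$2"
  if [[ "$value" =~ ^[A-Za-z0-9._:-]{1,100}$ ]]; then
    printf '%s\n' "$value"
  else
    printf '%s\n' "$default"
  fi
}

# Usage: every value falls back to its default if the file or key is unusable.
CONFIG="config/teacher-settings.json"
MODEL="$(validated_model "$(read_setting "$CONFIG" model)" "example-model")"
TEMPERATURE="$(validated_float "$(read_setting "$CONFIG" temperature)" 0 1 0.7)"
MAX_TOKENS="$(validated_int "$(read_setting "$CONFIG" max_tokens)" 1 4096 1024)"
```

Because a missing file, malformed JSON, a wrong type, and an out-of-range value all collapse to "print the default", there is only one fallback path to reason about.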

Key decisions

- All new external inputs (config values, prompt file) degrade in the safe direction.
- No new runtime dependencies: pure bash plus the existing python3.
- With no config file present (the current real state), runtime behavior is unchanged.
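
To illustrate "degrade in the safe direction" for the prompt file, here is a rough sketch of a fail-closed loader. The function name `load_system_prompt` and the choice to print the error code on stderr are assumptions. The codes themselves (prompt_file_missing, prompt_file_empty) are the ones this PR adds to the audit log.

```shell
#!/usr/bin/env bash
# Sketch only: load_system_prompt is a hypothetical name; the real loader
# in ai-tutor.sh may differ.

# Print the prompt on success. On a missing, unreadable, or blank file,
# print an error code on stderr and return non-zero so the caller can
# skip the API call entirely.
load_system_prompt() {
  local prompt_file="$1" prompt
  if [[ ! -f "$prompt_file" || ! -r "$prompt_file" ]]; then
    echo "prompt_file_missing" >&2
    return 1
  fi
  prompt="$(<"$prompt_file")"
  # A file containing only whitespace counts as empty.
  if [[ -z "${prompt//[[:space:]]/}" ]]; then
    echo "prompt_file_empty" >&2
    return 1
  fi
  printf '%s\n' "$prompt"
}
```

A caller would wrap this as `if ! SYSTEM_PROMPT="$(load_system_prompt skills/safety/system-prompt.txt)"; then ... fi` and return the friendly fallback inside the branch, so no request is ever sent without a prompt.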

Testing

- bash tests/run-tests.sh → 45 deterministic assertions pass; golden-set skips without a key.
- ./scripts/safety-check.sh → all 9 checks pass.
- bash -n clean on all scripts; workflow YAML validates.
- Config parsing verified: valid overrides applied, out-of-range values fall back to defaults.
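
The "skipped, not failed" behavior of the golden-set can be sketched with a dedicated exit code. This is one possible convention, not necessarily what tests/run-tests.sh does: the skip code 77, the `run_suite` and `golden_set` names, and the `TUTOR_API_KEY` variable are all placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: exit code 77 for "skipped" and the TUTOR_API_KEY variable
# name are assumptions made for illustration.
SKIP_CODE=77

# Run a suite and map its exit status: 0 = pass, SKIP_CODE = skip, else fail.
# A skip returns 0 so CI stays green when no key is configured.
run_suite() {
  local name="$1"; shift
  "$@"
  local status=$?
  case "$status" in
    0)            echo "PASS $name"; return 0 ;;
    "$SKIP_CODE") echo "SKIP $name"; return 0 ;;
    *)            echo "FAIL $name (exit $status)"; return 1 ;;
  esac
}

# Stand-in for the live golden-set: skip when no API key is present.
golden_set() {
  if [[ -z "${TUTOR_API_KEY:-}" ]]; then
    return "$SKIP_CODE"
  fi
  # Live tutor checks would run here.
  return 0
}
```

Printing SKIP explicitly keeps a skipped suite visible in the CI log, so it is not mistaken for a pass.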

Post-Deploy Monitoring & Validation

What to monitor/search
- Logs: logs/audit-*.log for new error codes config_parse_failed, prompt_file_missing, prompt_file_empty.
- CI: the new "Run tests" step in the EduStack Safety Check workflow.
Validation checks (commands)
- bash ./tests/run-tests.sh (offline)
- ./scripts/safety-check.sh
- With a key + ai_enabled: true: bash ./tests/test-tutor-golden.sh before/after any model change.
Expected healthy behavior: tutor responses unchanged with no config file; model swap via config alone passes the golden-set.
Failure signal / rollback trigger: prompt_file_missing in audit logs (prompt file not deployed) → tutor fails closed to the friendly fallback; restore skills/safety/system-prompt.txt. Revert the branch if CI tests regress.
Validation window & owner: first classroom session after deploy; repo owner.
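
For the log monitoring step, a small helper along these lines could count the new error codes across audit logs. It assumes each code appears as plain text on the log line where it occurred. The actual audit log format is not shown in this PR, and `count_audit_errors` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: assumes error codes appear verbatim in audit log lines.

# Print "<code> <count>" for each new error code across audit-*.log in $1.
count_audit_errors() {
  local log_dir="$1" code count
  for code in config_parse_failed prompt_file_missing prompt_file_empty; do
    # grep -c prints 0 when nothing matches, including when no logs exist.
    count="$(cat "$log_dir"/audit-*.log 2>/dev/null | grep -cF -- "$code")"
    printf '%s %s\n' "$code" "${count:-0}"
  done
}
```

Running `count_audit_errors logs` after the first classroom session should show zeros across the board; any non-zero prompt_file_missing count is the rollback trigger described above.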

🤖 Generated with Claude Opus 4.8 (1M context) via Claude Code

…tings.json

Model params were hardcoded in ai-tutor.sh business logic, making model swaps require code edits in two places with no fail-safe. Read them from the already-loaded config file with validated ranges and safe defaults equal to the historical values; missing/malformed config falls back cleanly. Ship teacher-settings.example.json and update PROMPT-BUILDER.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The system prompt was triplicated across ai-tutor.sh, CLAUDE.md, and PROMPT-BUILDER.md, and had already drifted (different banned-category lists). Move it to skills/safety/system-prompt.txt as the single source, load it fail-closed in ai-tutor.sh (missing/empty -> no API call), and add safety-check.sh check 9 verifying it covers every CLAUDE.md banned category. Docs now reference the file instead of restating it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
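
A coverage check in the spirit of check 9 might look like the sketch below. How safety-check.sh actually extracts the banned categories from CLAUDE.md is not described here, so this version assumes they are already available as a plain list, one category per line. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes banned categories are supplied one per line in a
# separate file; extraction from CLAUDE.md is out of scope.

# Print each category from $2 that the prompt file $1 does not mention.
# Returns 1 if any category is missing (or the prompt file is unreadable).
check_prompt_coverage() {
  local prompt_file="$1" categories_file="$2" category missing=0
  while IFS= read -r category || [[ -n "$category" ]]; do
    [[ -z "$category" ]] && continue
    # Case-insensitive, fixed-string match.
    if ! grep -qiF -- "$category" "$prompt_file" 2>/dev/null; then
      echo "MISSING: $category"
      missing=1
    fi
  done < "$categories_file"
  return "$missing"
}
```

An unreadable prompt file makes every category count as missing, which keeps the check consistent with the fail-closed loading in ai-tutor.sh.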

No tests existed for the safety pipeline, so a model swap or a weakened filter would go undetected (the 'silence test'). Add tests/ covering content-filter.sh (every injection pattern + every blocklist category + PII + clean cases, 36 assertions) and input-sanitizer.sh (6 assertions), plus an API-key-gated tutor golden-set that runs live only when a key is present and is skipped (not failed) in CI. Wire run-tests.sh into the safety-check workflow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
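
The assertion style described in this commit can be sketched in a few lines of bash. This is not the contents of tests/: `assert_exit`, `report`, and `toy_filter` are illustrative names, and the convention that a filter exits 0 for clean input and non-zero for blocked input is an assumption about content-filter.sh.

```shell
#!/usr/bin/env bash
# Sketch only: toy_filter stands in for content-filter.sh, whose real
# interface and exit codes are assumed, not confirmed.
PASS_COUNT=0
FAIL_COUNT=0

# assert_exit EXPECTED DESCRIPTION COMMAND [ARGS...]
# Runs the command quietly and compares its exit status to EXPECTED.
assert_exit() {
  local expected="$1" description="$2"; shift 2
  "$@" >/dev/null 2>&1
  local actual=$?
  if [[ "$actual" -eq "$expected" ]]; then
    PASS_COUNT=$((PASS_COUNT + 1))
  else
    FAIL_COUNT=$((FAIL_COUNT + 1))
    echo "FAIL: $description (expected exit $expected, got $actual)"
  fi
}

# Print a summary; a non-zero return is what would break CI.
report() {
  echo "$PASS_COUNT passed, $FAIL_COUNT failed"
  [[ "$FAIL_COUNT" -eq 0 ]]
}

# Toy filter: exit 1 (blocked) if the input contains a known injection phrase.
toy_filter() {
  local lowered
  lowered="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  [[ "$lowered" != *"ignore previous instructions"* ]]
}
```

One assertion per injection pattern and blocklist category means that deleting or loosening a pattern flips exactly one assertion, so the failure message points straight at what changed.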

EVASION_FILE was defined but never read; the leet-speak/spacing normalization is intentionally hardcoded in normalize(). Replace the dead var with a clarifying comment, mark evasion-patterns.txt as documentation only, and add leet-speak regression tests so the normalization is covered. Mark plan completed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
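
For readers unfamiliar with this kind of normalization, here is a deliberately blunt sketch of a leet-speak and spacing normalizer. It is not the normalize() in the repo: the character mapping is an assumption, and the real function is presumably more careful about ordinary digits and word boundaries.

```shell
#!/usr/bin/env bash
# Sketch only: the substitution table below is an assumption, not the
# mapping hardcoded in the project's normalize().

# Lowercase the input, map common leet characters to letters, then drop
# everything that is not a-z so spaced-out or dotted words collapse.
normalize() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr '013457@$' 'oieastas' \
    | tr -cd 'a-z'
}
```

For example, `normalize 'H 4 c k'` yields `hack`. Because this form also merges adjacent words and rewrites ordinary digits, a filter built on it would likely check the squashed string in addition to the original text rather than instead of it.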

Cartooli and others added 4 commits June 5, 2026 09:58

Cartooli wants to merge 4 commits into main from feat/durability-hardening
