feat(issue-20): S2 Wiki-backed Report Memory#23

Open
LucasErcolano wants to merge 3 commits into main from issue-20-s2-memory-feature

LucasErcolano commented May 31, 2026

Summary

Implements S2 Wiki-backed Report Memory for issue #20.

This PR adds a persistent local Markdown Wiki as an auxiliary audit/evidence context layer for ReportAgent. It does not replace Zep, GraphRAG, or the existing operational memory stack. Baseline behavior remains unchanged unless a wiki_context is explicitly available/provided.

Linked issue

Closes #20

What changed

  • Added backend/app/services/wiki_memory/ with:
    • WikiStore
    • WikiCompiler
    • schemas/templates
    • build_wiki_context_for_report(...)
  • Generates the requested per-run/case Wiki artifacts:
    • wiki/agents.md
    • wiki/index.md
    • wiki/timeline.md
    • wiki/sources.md
    • wiki/contradictions.md
    • wiki/entities/*.md
    • wiki/claims/*.md
    • wiki/wiki_meta.json
    • wiki/wiki_compile_log.jsonl
    • root wiki_compile_log.jsonl
    • root wiki_context.md
  • Integrated optional Wiki audit context into ReportAgent planning/section prompts via <wiki_audit_context>.
  • Integrated API fallback: wiki context construction is non-fatal and falls back to wiki_context=None on error/missing data.
  • Added docs: docs/wiki_backed_report_memory.md.
  • Added bounded real-lite smoke script: scripts/real_lite_smoke.py.
  • Added unit/integration/smoke tests for Wiki initialization, safe writes, timeline, claims, compiler, ReportAgent integration, context artifact, and graceful fallback.
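
The optional prompt injection described above can be sketched as follows. This is a minimal illustration, not the code in report_agent.py: the helper name `with_wiki_audit_context` and the wording of the label inside the block are assumptions; only the `<wiki_audit_context>` tag and the "audit context, not ground truth" framing come from this PR.

```python
from typing import Optional


def with_wiki_audit_context(prompt: str, wiki_context: Optional[str]) -> str:
    """Append a <wiki_audit_context> block only when context is available.

    Hypothetical helper: the name and label text are illustrative.
    """
    if not wiki_context or not wiki_context.strip():
        # No wiki context: the baseline prompt is returned unchanged.
        return prompt
    return (
        f"{prompt}\n\n"
        "<wiki_audit_context>\n"
        "Audit/background context compiled from the local Wiki. "
        "Treat it as prior knowledge to verify, not as ground truth.\n\n"
        f"{wiki_context.strip()}\n"
        "</wiki_audit_context>"
    )
```

Returning the prompt untouched when `wiki_context` is `None` or blank is what keeps baseline behavior identical for callers that never opt in.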

Scope / PR hygiene

The PR is now narrowed to the S2 Wiki-backed Report Memory MVP.

Removed from the final branch diff:

  • general experiment harness files
  • experiment_runner.py
  • memory_mode.py
  • memory_* configs
  • experimental-memory docs/configs/tests that belonged to the earlier broader direction
  • unrelated Zep/GraphRAG/OASIS/tooling/config changes

The final diff keeps the feature minimal, local, additive, and opt-in.

MVP opt-in behavior

  • ReportAgent(..., wiki_context=None) is the default.
  • If no wiki/run data exists, build_wiki_context_for_report(...) returns None and does not create wiki directories as a side effect.
  • If context construction fails, API/report generation logs a warning and continues with wiki_context=None.
  • Wiki context is explicitly marked as audit/background context, not ground truth.
  • Zep/GraphRAG remain the operational memory stack.
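
A rough sketch of the opt-in and fallback rules listed above, under stated assumptions: `read_context_if_present` is a hypothetical stand-in for build_wiki_context_for_report(...), whose real signature is not shown in this PR description, and `safe_wiki_context` is an illustrative wrapper, not the code in backend/app/api/report.py.

```python
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("report")


def read_context_if_present(run_dir: Path) -> Optional[str]:
    """Hypothetical builder: return wiki context text, or None if absent.

    It only reads; it never creates directories as a side effect.
    """
    context_file = run_dir / "wiki_context.md"
    if not context_file.is_file():
        return None
    text = context_file.read_text(encoding="utf-8").strip()
    return text or None


def safe_wiki_context(
    builder: Callable[[Path], Optional[str]], run_dir: Path
) -> Optional[str]:
    """Call the builder; on any failure log a warning and return None."""
    try:
        return builder(run_dir)
    except Exception as exc:  # non-fatal by design: the report must still run
        logger.warning("wiki context unavailable, continuing without it: %s", exc)
        return None
```

With this shape, the caller can pass the result straight through as `wiki_context`, so a missing run directory and a crashing builder both end in the same baseline path.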

How to test

Targeted wiki suite:

.venv/bin/python -m pytest tests/test_wiki_compiler.py tests/test_wiki_memory.py tests/test_wiki_memory_additional.py tests/test_wiki_report_integration.py tests/test_wiki_smoke.py -q

Syntax check:

python -m py_compile backend/app/services/wiki_memory/*.py backend/app/services/report_agent.py backend/app/api/report.py scripts/real_lite_smoke.py

Optional bounded real-lite smoke, requiring a configured LLM provider/API key:

.venv/bin/python scripts/real_lite_smoke.py

Expected local smoke outputs are written under runs/wiki_report_memory_real_lite_<timestamp>/ and include wiki/, wiki_compile_log.jsonl, wiki_context.md, baseline/with-wiki reports, and comparison artifacts.

Local verification

116 passed in 1.43s

from:

.venv/bin/python -m pytest tests/test_wiki_compiler.py tests/test_wiki_memory.py tests/test_wiki_memory_additional.py tests/test_wiki_report_integration.py tests/test_wiki_smoke.py -q

and:

python -m py_compile backend/app/services/wiki_memory/*.py backend/app/services/report_agent.py backend/app/api/report.py scripts/real_lite_smoke.py

completed successfully.

Limitations

This PR does not claim:

  • full OASIS validation
  • multi-seed robustness
  • predictive improvement
  • replacement of Zep/GraphRAG
  • a complete new global memory mode
  • UI/Obsidian-style editing
  • vector search over the Wiki
  • statistical forecast-quality evaluation

The smoke validates the S2 integration boundary: real Wiki compilation, real context assembly/injection, safe fallback, and reproducible local artifacts.

Resolves issue #20.

- Add memory_mode feature flag:
  baseline | experimental, env/YAML driven, rollback-safe.
- Add experiment runner:
  deterministic run_id, seed control, snapshot config,
  seed/prompt hashes, results.json export, runs/<case>/<variant>/<seed>/ layout.
- Add docs and configs:
  docs/memory_experimental.md, docs/experiment_harness.md,
  configs/memory_baseline.yaml, configs/memory_experimental.yaml,
  configs/experiments/example_case.yaml,
  configs/experiments/v1_smoke_*.yaml incl. no-report smoke variant.
- Add tests:
  backend/tests/test_memory_mode.py,
  backend/tests/test_experiment_runner.py,
  backend/tests/test_experiment_runner_memory.py.
- Update backend services/tests for experimental memory integration,
  spike baseline/rollback behavior, memory metrics logging, and
  safe backend logger handling.
- Update .gitignore for logs/runs/artifacts.
- Final pre-merge cleanup: move temporary smoke/log artifacts out of tree;
  preserve no-report smoke config for simulation path validation.

Issue: #20

Add optional wiki audit context layer to ReportAgent that compiles
simulation knowledge-base pages into structured context injected into
planning and section-generation prompts. Feature is fully opt-in via
build_wiki_context_for_report()/wiki_context=None — no change to
existing behavior when not activated.

Implementation:
- backend/app/services/wiki_memory/: new package (WikiStore,
  WikiCompiler, schemas, templates) for compiling wiki pages into
  context for report generation
- backend/app/services/report_agent.py: add wiki_context param,
  inject <wiki_audit_context> block into plan_outline and
  generate_section_react prompts with prior-knowledge labeling
- backend/app/api/report.py: integrate wiki context building with
  graceful degradation (non-fatal on error)
- backend/app/services/__init__.py: refactor to lazy-import heavy
  services, eager-export wiki_memory public API

Tests: 116/116 passing (compiler, store, integration, smoke).
Docs: docs/wiki_backed_report_memory.md with MVP activation details.
Smoke: scripts/real_lite_smoke.py for real-LLM verification.
LucasErcolano changed the title from "feat: experimental memory S2 + reproducible harness (closes #20)" to "feat(issue-20): S2 Wiki-backed Report Memory" on Jun 2, 2026

LucasErcolano commented Jun 6, 2026

Qualitative gap analysis for closing the S2 issue (#20) properly:

  • CI / PR Hygiene: the check currently fails because the PR body lacks the exact sections the workflow expects. Add:
    • ## Linked issue with Closes #20
    • ## How to test with the commands/verification used.
  • Full compliance with the issue: ensure and document that every artifact requested by #20 (S2: Wiki-backed Report Memory para auditoría temporal) exists: docs/wiki_backed_report_memory.md, WikiStore, WikiCompiler, per-run/case wiki structure, index.md, timeline.md, sources.md, contradictions.md, entities/*.md, claims/*.md, wiki_compile_log.jsonl, wiki_context.md, ReportAgent integration, tests, and the real-lite smoke.
  • Scope: the Wiki-backed Report Memory part is aligned with #20, but the diff also carries pieces of an earlier, more general direction (experiment_runner.py, experimental_memory.py, memory_mode.py, memory_* configs, experiment harness docs). The issue asks for a minimal, non-invasive auxiliary Wiki; move those files to another PR or justify explicitly why they are needed to satisfy #20.
  • Mandatory opt-in MVP: demonstrate that the feature does not create a new mandatory global memory mode, does not replace Zep/GraphRAG, and only injects wiki_context when it is available.
  • Smoke evidence: provide a reproducible, bounded command to regenerate the real-lite smoke, or a minimal fixture that does not depend on local/gitignored state.
  • Fallback/runtime: verify and document that when there is no run_dir, metadata, or valid wiki, the API/ReportAgent always degrade to wiki_context=None without excessive latency or a user-visible exception.
  • Conceptual naming: avoid leaving "experimental memory" as the dominant name in code/docs if the final feature is called "Wiki-backed Report Memory".

In short: to close #20, the PR must satisfy the issue's full checklist and keep its scope on a minimal, opt-in Wiki-backed Report Memory, without mixing in a general experimental-memory framework unless the issue requires it.
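
The PR-hygiene check mentioned in the first bullet could look roughly like the sketch below. The actual workflow is not shown in this thread, so the function names and matching rules are assumptions; only the two required headings and the `Closes #20` line come from the comment.

```python
import re

# Headings the comment says the workflow expects, matched as whole lines.
REQUIRED_SECTIONS = ("## Linked issue", "## How to test")


def missing_sections(body: str) -> list[str]:
    """Return the required headings that do not appear as a line of the body."""
    lines = {line.strip() for line in body.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in lines]


def closes_issue(body: str, number: int) -> bool:
    """True if the body contains a 'Closes #<number>' reference."""
    return re.search(rf"\bCloses\s+#{number}\b", body) is not None
```

Matching headings as whole lines (rather than substrings) mirrors the "exact sections" wording: a body that merely mentions "How to test" in prose would still be reported as missing the section.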

Development

Successfully merging this pull request may close these issues.

S2: Wiki-backed Report Memory para auditoría temporal