
feat(s4): P1.5 S4 — Testing infrastructure wiring #92

Merged
bayrem merged 6 commits into main from feat/s4-testing-infra
Jun 17, 2026

Conversation

@bjridicodes

Contributor

Summary

309 unit tests green (was 294, +15 new). Lint clean.

What remains for S4 acceptance (requires live GCP infrastructure)

Test plan

  • All 309 unit tests pass (pytest tests/unit/ -v)
  • Lint clean (make lint)
  • TestUC1RunbookAcceptance — 3 tests asserting TF log paths and keywords extracted correctly
  • TestUC2RunbookAcceptance — 2 tests asserting Dataproc keyword coverage
  • GCP resource_type filter tests — resource.type and cluster_name in filter string
  • _validate_log_paths tests — safe paths kept, unsafe paths dropped
  • Pipeline resilience test — Agent 4 runs when Agent 3 raises ClassificationError
  • Classifier retry test — second LLM attempt succeeds after first failure

🤖 Generated with Claude Code

bjridicodes and others added 6 commits June 16, 2026 13:48
…ty nodes

Cherry-picked content from PR #82 (Tobi-Adesoye, commit 85b658e).
Only the three runbook files are brought in — the test regression from
that commit (7 deleted KB unit tests) and the orphaned scripts/uc1_parser.py
are intentionally excluded.

These runbooks will be validated against actual TF log paths as part of #60.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rectness

Closes #59 — cluster_hosts.json restructured for UC1 TF node names (cdp-master-01,
cdp-data-01/02, cdp-utility-01, cdp-bus-01). IPs placeholder until TF apply.

Closes #60 — UC1 KB runbooks validated and enriched against TF log paths.
_KEYWORD_RE extended with Kafka, ZooKeeper, NiFi, AuthenticationException,
DiskOutOfSpaceException, GC overhead. test_file_kb.py: restored original 8 tests,
updated fixture count (2→8), added TestUC1RunbookAcceptance (3 tests).

Closes #61 — cdp_ssh_key_secret() config option added (core/config.py); SSH key vault
key now configurable via conf.yaml cdp.ssh_key_secret, default CDP_SSH_KEY unchanged.
api/dependencies.py wired to cfg.cdp_ssh_key_secret(). conf_template.yaml annotated
with UC1 TF secret alignment guidance and full TF log dir paths.

Closes #62 — GCPLogConnector: resource_types param adds resource.type OR-clause and
cluster_name host label alias for Dataproc. api/dependencies.py sets
['cloud_dataproc_cluster', 'cloud_dataproc_job'] for UC2. 3 new filter tests.
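
As a rough illustration of the filter change described for #62, the sketch below builds a Cloud Logging filter string with a resource.type OR-clause and a cluster_name clause. The helper name `build_log_filter`, its signature, and the exact label path `resource.labels.cluster_name` are assumptions made for this example; they are not taken from GCPLogConnector.

```python
from typing import Optional, Sequence


def build_log_filter(
    cluster_name: str,
    resource_types: Optional[Sequence[str]] = None,
) -> str:
    """Build a Cloud Logging filter string (hypothetical helper)."""
    clauses = []
    if resource_types:
        # OR-clause over resource.type, e.g. resource.type=("a" OR "b")
        alternatives = " OR ".join(f'"{rt}"' for rt in resource_types)
        clauses.append(f"resource.type=({alternatives})")
    # Assumed label path used to match the Dataproc cluster by name
    clauses.append(f'resource.labels.cluster_name="{cluster_name}"')
    return " AND ".join(clauses)


uc2_filter = build_log_filter(
    "aria-uc2-cluster",
    ["cloud_dataproc_cluster", "cloud_dataproc_job"],
)
print(uc2_filter)
```

When `resource_types` is omitted, the sketch emits only the cluster_name clause, which is one way to keep the previous behaviour for callers that do not pass the new parameter.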

Closes #63 — UC2 Dataproc KB runbooks (dataproc_cluster.md, dataproc_job.md) added.
TestUC2RunbookAcceptance (2 tests).

Closes #64 — gcp_native.md added as UC3 graceful degradation marker. Agent 2 returns
LOW confidence / empty logs for native GCP services; Agent 4 notifies with gap message.

Closes #85 — _validate_log_paths() in log_extractor.py: drops LLM-planned paths
outside /var/log/ before passing to connectors. 4 new unit tests.
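
A minimal sketch of the kind of check #85 describes, assuming paths are POSIX-style strings and that normalising `..` segments before the prefix test is acceptable. The function name `validate_log_paths` and the constant `ALLOWED_ROOT` are illustrative and may differ from the real `_validate_log_paths()` in log_extractor.py.

```python
import posixpath
from typing import Iterable, List

ALLOWED_ROOT = "/var/log/"  # only paths under this directory are kept


def validate_log_paths(paths: Iterable[str]) -> List[str]:
    """Return the paths that stay inside ALLOWED_ROOT after normalisation."""
    safe = []
    for raw in paths:
        # Collapse "." and ".." segments so traversal attempts are visible
        normalised = posixpath.normpath(raw)
        if normalised.startswith(ALLOWED_ROOT):
            safe.append(normalised)
    return safe


planned = [
    "/var/log/hadoop/nn.log",   # kept
    "/etc/passwd",              # dropped: outside /var/log/
    "/var/log/../etc/shadow",   # dropped: escapes via ".."
    "/var/logs/x",              # dropped: similar prefix, different directory
]
print(validate_log_paths(planned))
```

Testing against the prefix with its trailing slash avoids accepting sibling directories such as /var/logs/.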

Closes #83 — ClassificationError caught in _agent3_node (pipeline.py): Agent 4 now
always runs, notify-only guarantee preserved. ClassifierAgent adds 1 retry + 1s sleep
before raising. 1 retry test + 1 pipeline resilience test added.
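
The retry and resilience behaviour described for #83 could look roughly like the sketch below. It assumes a dict-based pipeline state and an injectable `sleep` so tests need not wait; `classify_with_retry`, `agent3_node`, and the state keys are hypothetical names, and the `ClassificationError` class here is only a stand-in for the project's exception.

```python
import time
from typing import Any, Callable, Dict


class ClassificationError(Exception):
    """Stand-in for the project's exception of the same name."""


def classify_with_retry(
    call_llm: Callable[[str], Any],
    logs: str,
    retries: int = 1,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call the classifier, retrying `retries` times before raising."""
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            return call_llm(logs)
        except Exception as exc:
            if attempt == attempts - 1:
                raise ClassificationError(str(exc)) from exc
            sleep(delay_s)  # pause before the next attempt


def agent3_node(state: Dict[str, Any], classify: Callable[[str], Any]) -> Dict[str, Any]:
    """Record the classification, or the failure, and always return the state."""
    try:
        state["classification"] = classify(state["logs"])
    except ClassificationError as exc:
        # Swallow the error so the downstream notification step still runs
        state["classification"] = None
        state["classification_error"] = str(exc)
    return state
```

Catching the error inside the node, instead of letting it propagate, is what lets the next node run with a partial state.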

309 unit tests green (was 294, +15 new).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both dataproc_cluster.md and dataproc_job.md scored equally for "cluster"
queries due to "cluster_name" appearing in the Log Paths section of
dataproc_job.md. This caused non-deterministic test failures in CI where
the wrong runbook was returned for cluster-level incidents (YARN missing).

Remove the token by replacing the multi-line filter block with a single
sentence that doesn't contain "cluster", making dataproc_cluster.md the
unique winner for cluster-targeted queries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous fix replaced cluster_name label text but left two more cluster
occurrences: the word literal in "UC2 cluster:" and the hyphen-split
token from "aria-uc2-cluster". Since _tokenize uses re.findall(r"\w+")
— hyphens split but underscores don't — aria-uc2-cluster tokenises to
["aria","uc2","cluster"], still tying with dataproc_cluster.md for
"dataproc-cluster gcp" queries.

Replace both with "UC2 job runner: aria-uc2-dataproc" so dataproc_job.md
scores 0.5 and dataproc_cluster.md scores 0.75 for cluster queries.
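
The tokenisation behaviour behind this fix can be checked directly. The snippet uses the same `re.findall(r"\w+")` call named above; the wrapper function `tokenize` is a stand-in for `_tokenize`, and scoring is left out because its formula is not shown here.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Split on anything that is not a word character (stand-in for _tokenize)."""
    return re.findall(r"\w+", text)


# Hyphens are not word characters, so they split the string
print(tokenize("aria-uc2-cluster"))   # ['aria', 'uc2', 'cluster']
# Underscores are word characters, so the identifier stays whole
print(tokenize("cluster_name"))       # ['cluster_name']
# The replacement text no longer yields a "cluster" token
print(tokenize("aria-uc2-dataproc"))  # ['aria', 'uc2', 'dataproc']
```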

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…structure

Two architectural fixes (1, 2) and two supporting additions (3, 4) in one commit:

1. Split knowledge_base fixtures into resource_kb/ (Agent 2) and analyser_kb/
   (Agent 3). Eliminates the design confusion that put failure vocabulary in
   Agent 2's resource catalog — the root cause of the S4 CI score-tie failures.

2. Consolidate from 8 per-component files to 3 per-cluster files in resource_kb.
   Each file describes a cluster's physical/logical resources and log paths — no
   error keywords, no failure descriptions. The cdp_cluster.md covers all 5 UC1
   nodes in one file; aria_uc2_cluster.md covers Dataproc logical resources.

3. Add analyser_kb/ with 5 labeled log excerpts (OOM, disk, auth, YARN safe mode,
   OK baseline) injected into Agent 3's prompt as few-shot examples. These files
   double as a training corpus for the future fine-tuned Agent 3 model.

4. ClassifierAgent gains analyser_kb_dir param + _load_analyser_kb() loader.
   cfg.analyser_kb_dir() reads knowledge_base.analyser_kb_dir / ARIA_ANALYSER_KB_DIR.
   get_agent3() passes the configured dir at construction.
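
One possible shape for the config lookup and loader in item 4 is sketched below. It assumes the environment variable takes precedence over the conf value, that the labeled excerpts are `.md` files, and that a missing directory yields an empty example list; none of these details are confirmed by this commit, and the function names are illustrative.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def analyser_kb_dir(
    conf: Mapping[str, Mapping[str, str]],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Resolve the analyser KB directory (assumed: env var wins over conf)."""
    from_env = env.get("ARIA_ANALYSER_KB_DIR")
    if from_env:
        return from_env
    return conf.get("knowledge_base", {}).get("analyser_kb_dir")


def load_analyser_kb(kb_dir: Optional[str]) -> List[str]:
    """Read every excerpt file in sorted order for use as few-shot examples."""
    if not kb_dir:
        return []
    root = Path(kb_dir)
    if not root.is_dir():
        return []
    # Assumed file extension; sorted so the prompt is stable between runs
    return [p.read_text(encoding="utf-8") for p in sorted(root.glob("*.md"))]
```

Sorting the files keeps the few-shot section of the prompt in the same order on every run, which helps when comparing classifier outputs.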

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bayrem bayrem merged commit 677ea7a into main Jun 17, 2026
7 checks passed
@bayrem bayrem deleted the feat/s4-testing-infra branch June 17, 2026 09:54