feat(validator): support NIM workloads in inference-perf so NIM recipes can ship a real performance gate

## Summary

The `inference-perf` check in `validators/performance/inference_perf_constraint.go:193-197` short-circuits with `status: "skipped - dynamo-platform not in recipe components"` whenever the resolved recipe's `componentRefs` lacks `dynamo-platform`. A NIM recipe (`h100-eks-ubuntu-inference-nim` and any future NIM leaves) declares `k8s-nim-operator` instead, so the validator silently skips. Adding a placeholder `inference-perf` block to a NIM overlay would satisfy the letter of the floor test but ship no real runtime gate.

## Motivation / Context

Surfaced during Codex review of #1009. The original PR added a placeholder `inference-perf` block to `h100-eks-ubuntu-inference-nim`, which the strict floor accepted but the validator would never actually run. PR #1009 was updated to revert the NIM block (see the #1009 commit history), leaving NIM coverage genuinely absent until this issue is closed.

Contributor docs (`docs/contributor/validations.md`, search for `inference-perf`) currently describe the check as inference + Dynamo plus a `DynamoGraphDeployment` workload. The skip behavior is intentional under that contract, but it means every NIM recipe ships without performance validation.

## Proposed scope

Pick one of the following directions (file additional follow-ups if the chosen direction isn't a single PR):

**Direction A: Extend `inference-perf` to NIM**
- Detect `k8s-nim-operator` (or `dynamo-platform`) at runtime; pick the corresponding deployment path.
- For NIM, deploy a representative `NIMService` (or equivalent CR per the NIM operator schema) and benchmark against it instead of a `DynamoGraphDeployment`.
- Emit the same output metrics (`inference-throughput`, `inference-ttft-p99`) so existing constraint names continue to work.
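
A minimal sketch of the detection step in Direction A, assuming the validator can see the recipe's component names as a plain string slice. The `Runtime` type, the `detectRuntime` helper, and the rule that `dynamo-platform` takes precedence when both components are present are all assumptions for illustration, not existing validator code; `gpu-operator` is just a filler component name.

```go
package main

import "fmt"

// Runtime is a hypothetical label for the inference runtime a recipe targets.
type Runtime string

const (
	RuntimeDynamo Runtime = "dynamo"
	RuntimeNIM    Runtime = "nim"
	RuntimeNone   Runtime = "" // no supported runtime: the check would still skip
)

// detectRuntime scans component names and picks a deployment path.
// Assumption: dynamo-platform wins if both components are declared,
// which preserves today's behavior for existing Dynamo recipes.
func detectRuntime(components []string) Runtime {
	hasNIM := false
	for _, c := range components {
		switch c {
		case "dynamo-platform":
			return RuntimeDynamo
		case "k8s-nim-operator":
			hasNIM = true
		}
	}
	if hasNIM {
		return RuntimeNIM
	}
	return RuntimeNone
}

func main() {
	// Illustrative component list for a NIM recipe.
	fmt.Println(detectRuntime([]string{"gpu-operator", "k8s-nim-operator"}))
}
```

With this shape, the existing skip path only triggers for `RuntimeNone`, so a NIM recipe would no longer pass through silently.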

**Direction B: Introduce a sibling check (e.g., `nim-inference-perf`)**
- Mirrors the existing one but only runs when `k8s-nim-operator` is in components.
- NIM overlays declare `checks: [nim-inference-perf]` instead.
- Pros: keeps the two runtimes' deployment surfaces separate.
- Cons: requires constraint-name divergence if metrics differ.

**Direction C: Generic harness with pluggable runtimes**
- Refactor `inference-perf` to a small dispatch table keyed on detected runtime.
- Future runtimes (e.g., vLLM Production Stack, TRT-LLM) plug in by adding a deploy + collect implementation.
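
One way the dispatch table in Direction C could look, sketched under assumptions: the `runtimeImpl` interface, the `registry` map, and the `stubRuntime` type are hypothetical names, and the stub stands in for real deploy and collect logic. Only the component names and the two metric names come from this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// Metrics carries the two outputs named in this issue; units are left open.
type Metrics struct {
	Throughput float64 // inference-throughput
	TTFTP99    float64 // inference-ttft-p99
}

// runtimeImpl is a hypothetical plug-in contract: deploy a workload, then collect.
type runtimeImpl interface {
	Deploy() error
	Collect() (Metrics, error)
}

// stubRuntime is a placeholder implementation used only for this sketch.
type stubRuntime struct {
	name     string
	deployed bool
}

func (s *stubRuntime) Deploy() error {
	s.deployed = true
	return nil
}

func (s *stubRuntime) Collect() (Metrics, error) {
	if !s.deployed {
		return Metrics{}, errors.New(s.name + ": collect called before deploy")
	}
	return Metrics{}, nil // a real implementation would run the benchmark here
}

var errNoRuntime = errors.New("no supported inference runtime in recipe components")

// registry is the dispatch table, keyed on the component that signals each runtime.
var registry = map[string]func() runtimeImpl{
	"dynamo-platform":  func() runtimeImpl { return &stubRuntime{name: "dynamo"} },
	"k8s-nim-operator": func() runtimeImpl { return &stubRuntime{name: "nim"} },
}

// selectRuntime returns the implementation for the first registered component found.
func selectRuntime(components []string) (runtimeImpl, error) {
	for _, c := range components {
		if newImpl, ok := registry[c]; ok {
			return newImpl(), nil
		}
	}
	return nil, errNoRuntime
}

func main() {
	impl, err := selectRuntime([]string{"k8s-nim-operator"})
	if err != nil {
		fmt.Println("skipped:", err)
		return
	}
	if err := impl.Deploy(); err != nil {
		fmt.Println("deploy failed:", err)
		return
	}
	m, err := impl.Collect()
	fmt.Printf("%+v %v\n", m, err)
}
```

Under this layout, adding a runtime would mean one new `registry` entry plus its implementation, with no change to the selection logic.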

A is the smallest change for now; C is the cleanest if more runtimes are coming.

## Done when

- An `inference-perf` (or sibling) gate produces real numbers for a representative NIM microservice on H100 / EKS / Ubuntu.
- `h100-eks-ubuntu-inference-nim.yaml` declares the matching `performance.checks` block with empirically-grounded thresholds (or smoke-test floors, with a comment to that effect).
- The strict-mode floor test (`AICR_VALIDATION_FLOOR_STRICT=1 go test ./pkg/recipe/... -run TestOverlayValidationPhaseFloor`) no longer flags `h100-eks-ubuntu-inference-nim`.
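
To make "real numbers against thresholds" concrete, here is a rough sketch of how a gate might compare measured metrics with declared limits. The `constraint` type and `evaluate` helper are hypothetical, the limits in `main` are placeholders rather than measured values, and treating throughput as a floor and TTFT p99 as a ceiling is an assumption. The point it illustrates is that a missing measurement counts as a failure instead of a silent pass.

```go
package main

import "fmt"

// constraint is a hypothetical threshold on one named metric.
type constraint struct {
	Name  string
	Min   bool    // true: measured must be >= Limit; false: measured must be <= Limit
	Limit float64 // placeholder values only in this sketch
}

// evaluate returns one message per violated or unmeasured constraint.
func evaluate(cs []constraint, measured map[string]float64) []string {
	var failures []string
	for _, c := range cs {
		v, ok := measured[c.Name]
		switch {
		case !ok:
			// No benchmark ran for this metric: fail loudly rather than skip.
			failures = append(failures, c.Name+": no measurement")
		case c.Min && v < c.Limit:
			failures = append(failures, fmt.Sprintf("%s: %.2f below floor %.2f", c.Name, v, c.Limit))
		case !c.Min && v > c.Limit:
			failures = append(failures, fmt.Sprintf("%s: %.2f above ceiling %.2f", c.Name, v, c.Limit))
		}
	}
	return failures
}

func main() {
	cs := []constraint{
		{Name: "inference-throughput", Min: true, Limit: 1},
		{Name: "inference-ttft-p99", Min: false, Limit: 1000},
	}
	// Only throughput was measured, so the TTFT constraint is reported.
	for _, f := range evaluate(cs, map[string]float64{"inference-throughput": 5}) {
		fmt.Println(f)
	}
}
```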

## Out of scope (track separately)

- Multi-model NIM benchmarking — pick one model (e.g., Qwen3 or Llama 3.1) to establish baseline; extend later.
- AKS / GKE / OKE NIM leaves — file as testbed availability lands.

## Related

- #1009 — PR that initially added (then reverted) a placeholder NIM perf block
- #1005 — addressable EKS+GKE perf work (NIM block deferred out)
- #969 — deployment-validation coverage (parent audit)
