4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -113,7 +113,7 @@ body:
placeholder: |
- AICR version (CLI `aicr version`, API image tag, or commit SHA):
- Install method (release binary / build from source / container image):
-- Platform (eks/gke/aks/oke/kind/lke/other):
+- Platform (eks/gke/aks/oke/ocp/kind/lke/other):
- Kubernetes version:
- OS (ubuntu/cos/other) + version:
- Kernel version:
@@ -122,7 +122,7 @@ body:
value: |
- AICR version (CLI `aicr version`, API image tag, or commit SHA):
- Install method (release binary / build from source / container image):
-- Platform (eks/gke/aks/oke/kind/lke/other):
+- Platform (eks/gke/aks/oke/ocp/kind/lke/other):
- Kubernetes version:
- OS (ubuntu/cos/other) + version:
- Kernel version:
6 changes: 3 additions & 3 deletions api/aicr/v1/server.yaml
@@ -112,7 +112,7 @@ paths:
description: Kubernetes service/environment type. If omitted, treated as "any" (wildcard).
schema:
type: string
-enum: [eks, gke, aks, oke, kind, lke, any]
+enum: [eks, gke, aks, oke, ocp, kind, lke, any]
default: any
- name: accelerator
in: query
@@ -479,7 +479,7 @@ paths:
description: Kubernetes service/environment type. If omitted, treated as "any" (wildcard).
schema:
type: string
-enum: [eks, gke, aks, oke, kind, lke, any]
+enum: [eks, gke, aks, oke, ocp, kind, lke, any]
default: any
- name: accelerator
in: query
@@ -1240,7 +1240,7 @@ components:
service:
type: string
description: Kubernetes service type
-enum: [eks, gke, aks, oke, kind, lke, any]
+enum: [eks, gke, aks, oke, ocp, kind, lke, any]
example: eks
accelerator:
type: string
2 changes: 1 addition & 1 deletion docs/README.md
@@ -8,7 +8,7 @@ NVIDIA AI Cluster Runtime (AICR) is a suite of tooling designed to automate the
|------|-------------|
| **Snapshot** | A captured state of a system including OS, kernel, Kubernetes, GPU, and SystemD configuration. Created by `aicr snapshot` or the Kubernetes agent. |
| **Recipe** | A generated configuration recommendation containing component references, constraints, and deployment order. Created by `aicr recipe` based on criteria or snapshot analysis. |
-| **Criteria** | Query parameters that define the target environment: `service` (eks/gke/aks/oke/kind/lke), `accelerator` (h100/gb200/b200/a100/l40/rtx-pro-6000), `intent` (training/inference), `os` (ubuntu/rhel/cos/amazonlinux/talos), `platform` (kubeflow), and `nodes`. |
+| **Criteria** | Query parameters that define the target environment: `service` (eks/gke/aks/oke/ocp/kind/lke), `accelerator` (h100/gb200/b200/a100/l40/rtx-pro-6000), `intent` (training/inference), `os` (ubuntu/rhel/cos/amazonlinux/talos), `platform` (kubeflow), and `nodes`. |
| **Overlay** | A recipe metadata file that extends the base recipe for specific environments. Overlays are matched against criteria using asymmetric matching. |
| **Mixin** | A composable recipe fragment (`kind: RecipeMixin`) that carries only `constraints` and `componentRefs`. Mixins live in `recipes/mixins/`, are excluded from overlay discovery, and are referenced by leaf overlays via `spec.mixins` to share orthogonal content (e.g., OS constraints, platform components) without duplication. See [ADR-005](design/005-overlay-refactoring.md). |
| **Bundle** | Deployment artifacts generated from a recipe: Helm values files, Kubernetes manifests, installation scripts, and checksums. |
33 changes: 31 additions & 2 deletions docs/contributor/api-server.md
@@ -262,7 +262,7 @@ Supported content types:

| Parameter | Type | Validation | Example |
|-----------|------|------------|--------|
-| `service` | ServiceType | Enum: eks, gke, aks, oke, kind, lke, any | `service=eks` |
+| `service` | ServiceType | Enum: eks, gke, aks, oke, ocp, kind, lke, any | `service=eks` |
| `accelerator` | AcceleratorType | Enum: h100, gb200, b200, a100, l40, rtx-pro-6000, any | `accelerator=h100` |
| `gpu` | AcceleratorType | Alias for accelerator | `gpu=h100` |
| `intent` | IntentType | Enum: training, inference, any | `intent=training` |
@@ -277,7 +277,36 @@ Shared with CLI - same logic as described in CLI architecture.

### Recipe Generation

-Endpoints `GET /v1/recipe` (query parameters) and `POST /v1/recipe` (criteria body, `application/json` or `application/x-yaml`). See [Query Parameter Parsing](#query-parameter-parsing) above for the GET parameter table and [POST Request Body Format](#post-request-body-format) above for the body schema.
+**Endpoints**:
+- `GET /v1/recipe` - Generate recipe from query parameters
+- `POST /v1/recipe` - Generate recipe from criteria body
+
+#### GET Method
+
+**Query Parameters**:
+- `service` - Kubernetes service type (eks, gke, aks, oke, ocp, kind, lke)
+- `accelerator` - GPU/accelerator type (h100, gb200, b200, a100, l40, rtx-pro-6000)
+- `gpu` - Alias for accelerator (backwards compatibility)
+- `intent` - Workload intent (training, inference)
+- `os` - Operating system family (ubuntu, rhel, cos, amazonlinux, talos)
+- `nodes` - Number of GPU nodes (0 = any/unspecified)

+#### POST Method
+
+**Content Types**: `application/json`, `application/x-yaml`
+
+**Request Body**: `RecipeCriteria` resource with kind, apiVersion, metadata, and spec fields.
+
+```yaml
+kind: RecipeCriteria
+apiVersion: aicr.nvidia.com/v1alpha1
+metadata:
+  name: my-criteria
+spec:
+  service: eks
+  accelerator: h100
+  intent: training
+```

**Response**: 200 OK
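
The two request shapes documented in this hunk can be sketched from the client side as follows. This is a minimal illustration, not the project's client. The base URL and the names `BASE_URL`, `ALLOWED_SERVICES`, `recipe_get_url`, and `recipe_post_body` are hypothetical. Only the path `/v1/recipe`, the parameter names, the `RecipeCriteria` fields, and the enum values come from the diff.

```python
import json
from urllib.parse import urlencode

# Values of the `service` enum after this change, including the "any" wildcard.
ALLOWED_SERVICES = ("eks", "gke", "aks", "oke", "ocp", "kind", "lke", "any")

# Hypothetical base URL: the diff does not say where the API server listens.
BASE_URL = "http://localhost:8080"


def _check_service(service: str) -> None:
    """Reject service values that are not in the documented enum."""
    if service not in ALLOWED_SERVICES:
        raise ValueError(f"unsupported service: {service!r}")


def recipe_get_url(service: str, accelerator: str, intent: str) -> str:
    """Build the URL for GET /v1/recipe from query parameters."""
    _check_service(service)
    query = urlencode(
        {"service": service, "accelerator": accelerator, "intent": intent}
    )
    return f"{BASE_URL}/v1/recipe?{query}"


def recipe_post_body(name: str, service: str, accelerator: str, intent: str) -> str:
    """Serialize a RecipeCriteria resource as JSON for POST /v1/recipe."""
    _check_service(service)
    criteria = {
        "kind": "RecipeCriteria",
        "apiVersion": "aicr.nvidia.com/v1alpha1",
        "metadata": {"name": name},
        "spec": {
            "service": service,
            "accelerator": accelerator,
            "intent": intent,
        },
    }
    return json.dumps(criteria)


if __name__ == "__main__":
    print(recipe_get_url("ocp", "h100", "training"))
    print(recipe_post_body("my-criteria", "ocp", "h100", "training"))
```

The client-side enum check only mirrors the documented values. The server remains the authority on what it accepts.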

2 changes: 1 addition & 1 deletion docs/contributor/cli.md
@@ -1701,7 +1701,7 @@ mkdir -p "$OUTPUT_DIR"
GPU_TYPES=("h100" "gb200" "b200" "a100" "l40" "rtx-pro-6000")

# Kubernetes services
-K8S_SERVICES=("eks" "gke" "aks" "oke" "kind" "lke")
+K8S_SERVICES=("eks" "gke" "aks" "oke" "ocp" "kind" "lke")

# OS distributions
OS_TYPES=("ubuntu" "rhel" "cos")
2 changes: 1 addition & 1 deletion docs/contributor/data.md
@@ -123,7 +123,7 @@ Criteria define when a recipe matches a user query:

| Field | Type | Description | Example Values |
|-------|------|-------------|----------------|
-| `service` | String | Kubernetes platform | `eks`, `gke`, `aks`, `oke`, `kind`, `lke` |
+| `service` | String | Kubernetes platform | `eks`, `gke`, `aks`, `oke`, `ocp`, `kind`, `lke` |
| `accelerator` | String | GPU hardware type | `h100`, `gb200`, `b200`, `a100`, `l40`, `rtx-pro-6000` |
| `os` | String | Operating system | `ubuntu`, `rhel`, `cos`, `amazonlinux` |
| `intent` | String | Workload purpose | `training`, `inference` |
2 changes: 1 addition & 1 deletion docs/contributor/validations.md
@@ -93,7 +93,7 @@ conditions:

**Supported Condition Keys:**
- `intent`: Workload intent (training, inference)
-- `service`: Kubernetes service (eks, gke, aks, oke, kind, lke)
+- `service`: Kubernetes service (eks, gke, aks, oke, ocp, kind, lke)
- `accelerator`: GPU type (h100, gb200, b200, a100, l40, rtx-pro-6000)
- `os`: Operating system (ubuntu, rhel, cos, amazonlinux, talos)
- `platform`: Platform/framework (kubeflow)