densify-dev · gasarekubex · May 12, 2026 · May 8, 2026 · kubexautomation · May 8, 2026
@@ -36,8 +36,8 @@ The Helm chart supports both Helm-managed configuration and manually managed cus
 
 Important:
 
-- The Helm-managed `scope` and `policy.policies` values preserve the existing values-driven flow from `values-edit.yaml` by generating `AutomationStrategy` and `ClusterProactivePolicy`, but those CRs can also be created and managed independently of Helm
-- `ProactivePolicy`, `StaticPolicy`, `ClusterStaticPolicy`, and `ClusterAutomationStrategy` are supported by the controller but are managed as separate CR manifests today
+- The Helm-managed `scope` and `policy.policies` values preserve the existing values-driven flow from `values-edit.yaml` by generating `ClusterAutomationStrategy` and `ClusterProactivePolicy`, but those CRs can also be created and managed independently of Helm
+- `ProactivePolicy`, `StaticPolicy`, `ClusterStaticPolicy`, and namespaced `AutomationStrategy` are supported by the controller but are managed as separate CR manifests today
 
 ## Core Components
 
@@ -101,6 +101,7 @@ This guide covers:
 | **[Global Configuration Reference](./docs/Global-Configuration.md)** | Field-by-field reference for the `GlobalConfiguration` custom resource |
 | **[Policy Configuration](./docs/Policy-Configuration.md)** | Configure strategies, policy scope, precedence, and Helm-managed policy generation |
 | **[Policy Evaluation Reference](./docs/Policy-Evaluation.md)** | Policy type precedence configuration via the `PolicyEvaluation` singleton |
+| **[GPU Sharing with KAI](./docs/GPU-Sharing-with-KAI.md)** | Configure KAI-backed GPU sharing, rebalancing, and early consolidation |
 | **[Apply Updates](./docs/Getting-Started.md#apply-configuration-updates)** | Re-run `helm upgrade` after configuration changes |
 
 ## Advanced Topics

@@ -1,5 +1,7 @@
 # Automation Strategies
 
+> Experimental: GPU/KAI-related fields in this resource are subject to breaking changes. When using them, set `spec.experimental.gpuKaiContract: v1alpha1-2026-04`.
+
 `AutomationStrategy` defines how resizing is allowed to happen within a namespace.
 
 Use it when a team owns its own namespace and should manage resize behavior locally.

@@ -1,5 +1,7 @@
 # Cluster Automation Strategies
 
+> Experimental: GPU/KAI-related fields in this resource are subject to breaking changes. When using them, set `spec.experimental.gpuKaiContract: v1alpha1-2026-04`.
+
 `ClusterAutomationStrategy` defines how resizing is allowed to happen for cluster-scoped policy flows.
 
 Use it when a platform team wants one reusable resize behavior that can be referenced by `ClusterProactivePolicy` and `ClusterStaticPolicy` across multiple namespaces.

@@ -151,6 +151,7 @@ Use [Global Configuration Reference](./Global-Configuration.md) for the CR field
 | `globalConfiguration.webhookProbe.resources` | `{}` | Resource requests and limits for the dry-run webhook probe container |
 | `globalConfiguration.webhookProbe.podSecurityContext` | `{}` | Pod security context for the dry-run webhook probe Pod |
 | `globalConfiguration.webhookProbe.securityContext` | `{}` | Container security context for the dry-run webhook probe container |
+| `experimental.gpuKaiContract` | `v1alpha1-2026-04` | Required acknowledgement token for experimental GPU/KAI CR fields rendered by the chart |
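+
+In a values file such as `values-edit.yaml`, the dotted key above would normally be written as nested YAML. This is a minimal sketch, assuming the chart follows the usual Helm mapping of dotted paths to nested keys; the value shown is the default from the table.
+
+```yaml
+# Acknowledge the experimental GPU/KAI contract for CR fields rendered by the chart.
+experimental:
+  gpuKaiContract: v1alpha1-2026-04
+```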
 
 ## Helm-Managed Policy Values
 

@@ -0,0 +1,175 @@
+# GPU Sharing with KAI
+
+This guide shows how to configure GPU sharing with KAI and Kubex Automation Engine.
+
+Tested with KAI `v0.12.16`.
+
+> [!IMPORTANT]
+> GPU/KAI fields and related custom resources are experimental and subject to breaking changes. Set `spec.experimental.gpuKaiContract: v1alpha1-2026-04` on GPU/KAI resources.
+
+## Prerequisites
+
+- KAI is already installed in the cluster
+- `kubex-crds` and `kubex-automation-engine` are already installed
+- Prometheus is available for GPU utilization metrics if you want to use `GpuRebalancingPolicy`
+
+This guide works with either:
+
+- a new KAI installation
+- an existing KAI installation
+
+For existing KAI-managed workloads, Kubex Automation Engine can update the `gpu-fraction` annotation without replacing the existing `kai.scheduler/queue` label.
+
+## Starter Example
+
+The following example creates:
+
+- an `AutomationStrategy` for KAI-enabled workloads in namespace `ml-team-a`
+- a `StaticPolicy` that sets an initial shared GPU request for matching `Deployment` workloads
+- a `GpuRebalancingPolicy` that adjusts that shared GPU request based on Prometheus GPU metrics
+
+Both policies target `Deployment` workloads in namespace `ml-team-a` that carry the label `nvidia.com/gpu.present: "true"`.
+
+```yaml
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: AutomationStrategy
+metadata:
+  name: kai-gpu-sharing
+  namespace: ml-team-a
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  enablement:
+    gpu:
+      overrideScheduler: "kai"
+      requests:
+        downsize: true
+        upsize: true
+        setFromUnspecified: false
+  kai:
+    queue: kubex-unlimited-gpu-queue
+    setQueueWhenSpecified: false
+  inPlaceResize:
+    enabled: false
+  podEviction:
+    enabled: true
+---
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: StaticPolicy
+metadata:
+  name: kai-gpu-sharing-baseline
+  namespace: ml-team-a
+spec:
+  scope:
+    labelSelector:
+      matchLabels:
+        nvidia.com/gpu.present: "true"
+    workloadTypes:
+      - Deployment
+  resources:
+    containers:
+      "*":
+        requests:
+          gpu: "0.25"
+  automationStrategyRef:
+    name: kai-gpu-sharing
+---
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: GpuRebalancingPolicy
+metadata:
+  name: kai-gpu-sharing-rebalancing
+  namespace: ml-team-a
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  scope:
+    labelSelector:
+      matchLabels:
+        nvidia.com/gpu.present: "true"
+    workloadTypes:
+      - Deployment
+  minPodMetricsAge: 15m
+  metrics:
+    compute:
+      upsize:
+        thresholdPercent: 125
+        metricsWindow: 10m
+        headroomPercent: 20
+        maxPercent: 200
+      scaleBack:
+        thresholdPercent: 60
+        metricsWindow: 10m
+        headroomPercent: 20
+      prometheus:
+        metric: kubex_gpu_container_compute_utilization_percent
+        namespaceLabel: namespace
+        podLabel: pod
+        containerLabel: container
+    memory:
+      upsize:
+        thresholdPercent: 125
+        metricsWindow: 10m
+        headroomPercent: 20
+        maxPercent: 200
+      scaleBack:
+        thresholdPercent: 60
+        metricsWindow: 10m
+        headroomPercent: 20
+      prometheus:
+        metric: kubex_gpu_container_memory_utilization_percent
+        namespaceLabel: namespace
+        podLabel: pod
+        containerLabel: container
+  automationStrategyRef:
+    name: kai-gpu-sharing
+```
+
+## Automation Strategy Notes
+
+For KAI-enabled workloads, start with `spec.inPlaceResize.enabled: false`.
+
+- Eviction-based resize is the safer path today for KAI-enabled workloads.
+- In-place resizing for KAI-enabled workloads is available for experimentation, but it is currently unstable.
+
+## Existing KAI Installations
+
+For workloads that are already scheduled through KAI:
+
+- keep the existing `kai.scheduler/queue` label on the workload template
+- let Kubex Automation Engine update `gpu-fraction` as policies are applied
+
+That allows Kubex Automation Engine to participate in GPU sharing without taking over queue assignment.
+
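+If you want Kubex Automation Engine to manage queue assignment instead, set `spec.kai.setQueueWhenSpecified: true` in your `AutomationStrategy`; the starter example uses `false`, which leaves an existing queue label untouched.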
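+
+For the case in the bullets above, where the workload keeps its own queue label, a workload template might look like the following sketch. The Deployment name, the queue name `team-a-queue`, the container image, and the initial `gpu-fraction` value are hypothetical. Placing the annotation on the pod template, matching the policy selector on the workload's labels, and the `kai-scheduler` scheduler name are assumptions, not confirmed by this guide.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: inference-server                    # hypothetical workload name
+  namespace: ml-team-a
+  labels:
+    nvidia.com/gpu.present: "true"          # assumed to be matched by the starter policies' labelSelector
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: inference-server
+  template:
+    metadata:
+      labels:
+        app: inference-server
+        kai.scheduler/queue: team-a-queue   # existing queue label, left in place (hypothetical queue)
+      annotations:
+        gpu-fraction: "0.25"                # annotation Kubex Automation Engine is expected to update
+    spec:
+      schedulerName: kai-scheduler          # assumed scheduler name for a default KAI install
+      containers:
+        - name: server
+          image: registry.example.com/inference-server:1.0   # hypothetical image
+```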
+
+## GPU Node Consolidation
+
+`GpuConsolidationPolicy` can be used to consolidate KAI GPU workloads onto fewer GPU nodes.
+
+Example targeting a specific worker pool:
+
+```yaml
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: GpuConsolidationPolicy
+metadata:
+  name: kai-gpu-workers-a
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  nodeSelector:
+    matchLabels:
+      nodepool: gpu-workers-a
+  utilizationThresholdPercent: 70
+  requeueAfter: 2m
+```
+
+## Consolidation Limitations
+
+GPU node consolidation is at an early stage and has known limitations.
+
+- It assumes pods will be schedulable on other nodes if they fit by GPU fraction.
+- It does not yet fully model all other scheduler constraints.
+- That can lead to frequent evictions when the controller chooses a node that looks drainable from GPU capacity alone but whose pods cannot actually be rescheduled cleanly.
+- It may behave unpredictably with nodes that have multiple GPUs.
+
+Use it carefully and start with a narrowly scoped worker pool.
@@ -1,5 +1,7 @@
 # Global Configuration
 
+> Experimental: GPU/KAI-related fields in this resource are subject to breaking changes. When using them, set `spec.experimental.gpuKaiContract: v1alpha1-2026-04`.
+
 `GlobalConfiguration` defines cluster-wide controller behavior that applies across strategies and policies.
 
 Use it to control recommendation refresh timing, proactive rescans, heartbeat reporting, global automation switches, protected namespaces, and webhook health thresholds.

@@ -0,0 +1,86 @@
+# GPU Consolidation Policy
+
+> Experimental: GPU/KAI fields and related custom resources are subject to breaking changes. Set `spec.experimental.gpuKaiContract: v1alpha1-2026-04`.
+
+`GpuConsolidationPolicy` is a cluster-scoped resource. Its controller looks at scheduled pods carrying the `gpu-fraction` annotation and tries to consolidate them off an underutilized node.
+
+## Behavior
+
+- The controller scans all scheduled, non-terminal pods with `metadata.annotations["gpu-fraction"]`.
+- `spec.nodeSelector` is required and uses standard Kubernetes label selector semantics.
+- Each policy defines one compatibility pool. Create multiple policies when you need multiple compatible node pools.
+- Only nodes selected by `spec.nodeSelector` are considered compatible for candidate selection and destination placement.
+- Selected nodes are expected to be mutually compatible for GPU workload movement.
+- Node GPU capacity is taken from `status.allocatable["nvidia.com/gpu"]`.
+- Nodes with utilization below `spec.utilizationThresholdPercent` are candidates, but nodes with no GPU-fraction pods are ignored.
+- Candidates are evaluated from most underutilized to least underutilized.
+- A node is consolidated only when every GPU-fraction pod on that node can fit onto other non-empty GPU nodes without exceeding their allocatable capacity.
+- The controller evicts all pods from the first drainable candidate node it finds in a reconcile loop.
+- Eviction is node-wide for a selected consolidation candidate: once a node is marked for consolidation, every evictable pod on that node is targeted, including pods without workload owners such as static pods.
+- Reconciliation is policy-driven: the controller runs on `GpuConsolidationPolicy` changes and on the periodic timer from `spec.requeueAfter`.
+- Pod and Node changes do not trigger immediate rescans.
+- If no node can be fully drained, the controller records that outcome in status and waits for the next `spec.requeueAfter`.
+
+## Examples
+
+```yaml
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: GpuConsolidationPolicy
+metadata:
+  name: gpu-consolidation-pool-a
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  nodeSelector:
+    matchLabels:
+      kubex.ai/gpu-pool: pool-a
+  utilizationThresholdPercent: 75
+  requeueAfter: 1m
+```
+
+Use one policy per compatibility pool:
+
+```yaml
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: GpuConsolidationPolicy
+metadata:
+  name: gpu-consolidation-l40s
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  nodeSelector:
+    matchExpressions:
+    - key: kubex.ai/gpu-pool
+      operator: In
+      values:
+      - batch-l40s
+    - key: accelerator.nvidia.com/class
+      operator: In
+      values:
+      - l40s
+  utilizationThresholdPercent: 70
+  requeueAfter: 2m
+---
+apiVersion: rightsizing.kubex.ai/v1alpha1
+kind: GpuConsolidationPolicy
+metadata:
+  name: gpu-consolidation-h100
+spec:
+  experimental:
+    gpuKaiContract: v1alpha1-2026-04
+  nodeSelector:
+    matchLabels:
+      kubex.ai/gpu-pool: training-h100
+  utilizationThresholdPercent: 80
+  requeueAfter: 1m
+```
+
+## Notes
+
+- This policy is cluster-scoped only.
+- `spec.nodeSelector` is the compatibility boundary for consolidation.
+- It is self-contained and does not reference `AutomationStrategy`.
+- Consolidation is based on GPU-fraction capacity only; it does not model CPU, memory, or scheduler affinity constraints.
+- Consolidation drain behavior is not limited to GPU-fraction pods. After a node is selected, the node is drained by evicting all evictable pods on it, even when some of those pods do not have owners.
+- If `spec.nodeSelector` matches no nodes, the policy reports `NoMatchingNodeSelector` and performs no evictions.
+- If you need faster reaction to workload churn, lower `spec.requeueAfter`.