10 changes: 6 additions & 4 deletions README.md
@@ -1,9 +1,9 @@
# Kubex Helm Charts

<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://www.kubex.ai/wp-content/uploads/kubex-by-densify-logo-reverse.png">
<source media="(prefers-color-scheme: light)" srcset="https://www.kubex.ai/wp-content/uploads/kubex-by-densify-logo.png">
<img src="https://www.kubex.ai/wp-content/uploads/kubex-by-densify-logo.png" width="300">
<source media="(prefers-color-scheme: dark)" srcset="https://kubex.ai/wp-content/uploads/kubex-logo-reverse-landscape.svg">
<source media="(prefers-color-scheme: light)" srcset="https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg">
<img src="https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg" width="300">
</picture>

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
@@ -42,7 +42,9 @@ The following charts are meant to be used **only** as dependencies of the [Insta

2. [K8s Ephemeral Storage Metrics](https://github.com/densify-dev/helm-charts/tree/master/charts/k8s-ephemeral-storage-metrics).

3. [Node Labeler](https://github.com/densify-dev/helm-charts/tree/master/charts/node-labeler).
3. [GPU Process Exporter](https://github.com/densify-dev/helm-charts/tree/master/charts/gpu-process-exporter).

4. [Node Labeler](https://github.com/densify-dev/helm-charts/tree/master/charts/node-labeler).

## Usage

4 changes: 2 additions & 2 deletions charts/container-optimization-data-forwarder/Chart.yaml
@@ -1,6 +1,6 @@
apiVersion: v2
description: Kubex Data Collector
icon: https://www.kubex.ai/wp-content/uploads/kubex-by-densify-logo.png
icon: https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg
keywords:
- kubex
- densify
@@ -11,4 +11,4 @@ maintainers:
name: support
name: container-optimization-data-forwarder
type: application
version: 4.0.16
version: 4.0.17
2 changes: 1 addition & 1 deletion charts/container-optimization-data-forwarder/README.md
@@ -128,7 +128,7 @@ The following table lists configuration parameters in values.yaml and their defa

## Documentation

* [Kubex](https://www.docs.kubex.ai)
* [Kubex](https://docs.kubex.ai)

## License

14 changes: 14 additions & 0 deletions charts/gpu-process-exporter/Chart.yaml
@@ -0,0 +1,14 @@
apiVersion: v2
name: gpu-process-exporter
description: A Helm chart for gpu-process-exporter
type: application
icon: https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg
version: 1.0.0
keywords:
  - kubex
  - gpu
  - process
  - exporter
maintainers:
  - email: support@kubex.ai
    name: support
118 changes: 118 additions & 0 deletions charts/gpu-process-exporter/README.md
@@ -0,0 +1,118 @@
# Kubex GPU Process Exporter Helm Chart

<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://kubex.ai/wp-content/uploads/kubex-logo-reverse-landscape.svg">
<source media="(prefers-color-scheme: light)" srcset="https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg">
<img src="https://kubex.ai/wp-content/uploads/kubex-logo-landscape.svg" width="300">
</picture>

## Purpose

This chart deploys the Kubex GPU process exporter. This exporter addresses the limitations of Nvidia's [DCGM Exporter](https://github.com/NVIDIA/dcgm-exporter) in providing container-level metrics.

## Motivation

The DCGM exporter collects metrics per GPU device, but falls short of associating the utilization metrics with the specific container that actually uses the GPU. To make this association, the DCGM exporter relies on the [Nvidia device plugin](https://github.com/NVIDIA/k8s-device-plugin). The association has the following issues:

* A basic assumption of the DCGM exporter is that ALL metrics of the device can (and should) be mapped to a **single** container using it. This assumption breaks in the case that the GPU is shared by multiple containers; it is also not the right approach for some metrics (non-utilization), which should not be mapped to containers.
* The DCGM exporter cannot deal with "soft" (software-based) GPU sharing techniques, such as time-slicing or MPS. With each datapoint the exporter randomly reports one of the containers using the GPU simultaneously, and attributes all the utilization to this container.
* The DCGM exporter also cannot deal with [KAI scheduler](https://github.com/kai-scheduler/KAI-Scheduler), which sets "reservation containers" to reserve the GPU, and schedules the actual workloads to utilize it.

The Kubex GPU process exporter addresses these limitations.

## Prerequisites

* A k8s cluster with at least one Nvidia GPU
* All nodes with Nvidia GPUs have to be labeled `nvidia.com/gpu.present=true` (typically done by the [Nvidia GPU Operator](https://github.com/nvidia/gpu-operator))

## Details

Deploys a DaemonSet with the following requirements:

* RBAC: `Pods - get, list, watch`
* access to `hostPID`
* security context: `privileged` container (runs as root)
* read-only access to the node's `/` filesystem
* read-only access to the node's `/proc` filesystem

## Configuration

The following table lists configuration parameters in values.yaml and their default values.

| Parameter | Mandatory | Description | Default |
| --- | --- | --- | --- |
| `image.repository` | | Exporter image repository. | `densify/gpu-process-exporter` |
| `image.tag` | :white_check_mark: | Exporter image tag. | |
| `image.pullPolicy` | | Exporter image pull policy. | `Always` |
| `serviceAccount.create` | | Create a service account for the exporter. | |
| `serviceAccount.name` | Required when `serviceAccount.create` is `false`. | Service account name to use. | `gpu-process-exporter` |
| `rbac.create` | | Create RBAC resources for Pod read access. | |
| `rbac.clusterRoleName` | | Name of the ClusterRole to create or bind. | `gpu-exporter-role` |
| `rbac.clusterRoleBindingName` | | Name of the ClusterRoleBinding to create. | `gpu-exporter-binding` |
| `prometheusScrape.annotate` | | Add Prometheus scrape annotations to the Service (typically used by [prometheus-community/prometheus helm chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus)). | |
| `prometheusScrape.interval` | | Scrape interval - should match the actual scrape interval of Prometheus (global or explicit) for this exporter. Passed to the exporter as `SCRAPE_INTERVAL` environment variable. | `20s` |
| `port` | | Container and Service metrics port. | `9494` |
| `service.type` | | Kubernetes Service type. | `ClusterIP` |
| `service.annotations` | | Additional annotations to add to the Service. | |
| `hostProcMount` | | Host path mounted into the exporter as `/host/proc`. See [here](#kind-clusters-and-proprietary-driver). | `/proc` |
| `nvmlSearchPath` | | Override path used by the exporter to find NVML shared libraries. See [here](#non-standard-nvml-so-files-location). | |
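
As an illustration of the table above, a minimal values override might look like the following sketch. The file name `my-values.yaml` and the image tag `1.0.0` are hypothetical placeholders, not values confirmed by this chart; the remaining keys and the `20s` default are taken from the table. Since the table lists no default for `serviceAccount.create` and `rbac.create`, the sketch sets them explicitly.

```yaml
# my-values.yaml (hypothetical file name)
image:
  tag: "1.0.0"        # hypothetical tag; image.tag is mandatory and has no default
serviceAccount:
  create: true        # set explicitly, the table lists no default
rbac:
  create: true        # creates the ClusterRole and ClusterRoleBinding for Pod read access
prometheusScrape:
  annotate: true      # adds Prometheus scrape annotations to the Service
  interval: 20s       # should match the actual Prometheus scrape interval for this exporter
```

Such a file would typically be passed to Helm with the `-f` (`--values`) flag at install or upgrade time.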

### Kind clusters and proprietary driver

The `hostProcMount` parameter is **only** required for a [kind](https://kind.sigs.k8s.io/) k8s cluster running on a host with a **proprietary** Nvidia driver (e.g. the series of `linux-modules-nvidia-<version>-server-generic` on Ubuntu). This is because `kind` nodes are Docker containers, and the proprietary Nvidia driver is not permitted to call the GPL-only kernel functions needed to resolve Linux PID namespaces, so it only has access to the host PIDs (which do not match the node's PIDs).

This parameter is **NOT** required if the cluster is NOT a `kind` cluster, or if the Nvidia driver uses the newer **Open GPU Kernel Modules** architecture (e.g. the series of `linux-modules-nvidia-<version>-server-open-generic` on Ubuntu), which is permitted access to these GPL-only functions and Linux PID namespaces.

If you have a `kind` cluster and a **proprietary** Nvidia driver, you need to deploy your cluster as follows:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
  extraMounts:
  - hostPath: /dev/null
    containerPath: /var/run/nvidia-container-devices/all
  - hostPath: /proc
    containerPath: /physical-host-proc
    readOnly: true
```

And then specify the parameter `hostProcMount: /physical-host-proc` in the values. This makes sure that the exporter has access to the **host's** `/proc` filesystem.
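
In a values file, this setting could be sketched as the following fragment. It only restates the parameter named above; the path has to match the `containerPath` of the `/proc` mount in the kind cluster configuration.

```yaml
# Values override for a kind cluster on a host with a proprietary Nvidia driver.
# The path is the containerPath of the /proc extraMount on the kind worker node.
hostProcMount: /physical-host-proc
```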

### Non-standard NVML .so files location

The exporter is required to load the NVML .so files from the **node's filesystem**. This makes sure that the right NVML version which matches the driver is loaded.

By default, the exporter searches the following well-known standard locations for the NVML .so files:

(`${DEBIAN_LIB_ARCH}` is one of `x86_64-linux-gnu` or `aarch64-linux-gnu`).

| Location | CSP / OS / Installation |
| --- | --- |
| `/home/kubernetes/bin/nvidia/lib64` | GKE COS / GKE GPU Operator with Google driver installer |
| `/opt/nvidia/lib64` | GKE Ubuntu Google driver installer |
| `/usr/local/nvidia/lib64` | NVIDIA container runtime / GKE exposed driver path / kind and nvkind |
| `/run/nvidia/driver/usr/lib64` | NVIDIA GPU Operator driver container, RPM-style |
| `/run/nvidia/driver/usr/lib/${DEBIAN_LIB_ARCH}` | NVIDIA GPU Operator driver container, Debian-style |
| `/usr/lib/${DEBIAN_LIB_ARCH}` | Ubuntu/Debian (GKE Ubuntu, AKS Ubuntu, OKE Ubuntu, kind) |
| `/usr/lib64` | EKS Amazon Linux, AKS Azure Linux, OKE Oracle Linux, Bottlerocket |
| `/lib/${DEBIAN_LIB_ARCH}` | Debian/Ubuntu merged-/usr compatibility |
| `/lib64` | RPM-style compatibility |

If the NVML .so files on your k8s cluster nodes are in a non-standard location, the parameter `nvmlSearchPath` is required and should be set to this location, as it is mounted under `/host/root/...`. In this case the standard locations are not searched.
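
A values fragment for this case might look like the sketch below. Two things here are assumptions rather than facts from this chart: the node directory `/opt/custom/nvidia/lib64` is a made-up example, and the value is written as the path seen from inside the container (prefixed with `/host/root`), which is one reading of the sentence above. Check the exporter's behavior before relying on it.

```yaml
# Hypothetical example: NVML .so files live in /opt/custom/nvidia/lib64 on the node.
# Assumption: the value is the in-container path, i.e. the node path under /host/root.
nvmlSearchPath: /host/root/opt/custom/nvidia/lib64
```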

---

## Limitations

* Supported architectures: amd64 (x64), arm64

## Documentation

* [Kubex](https://docs.kubex.ai)

## License

Apache 2 Licensed. See [LICENSE](LICENSE) for full details.
65 changes: 65 additions & 0 deletions charts/gpu-process-exporter/templates/_helpers.tpl
@@ -0,0 +1,65 @@
{{/*
Create the image repository to use
*/}}
{{- define "gpu-process-exporter.imageRepository" -}}
{{- default "densify/gpu-process-exporter" .Values.image.repository }}
{{- end }}
{{/*
Create the image tag to use
*/}}
{{- define "gpu-process-exporter.imageTag" -}}
{{- required "image.tag is required" .Values.image.tag }}
{{- end }}
{{/*
Create the image pull policy to use
*/}}
{{- define "gpu-process-exporter.imagePullPolicy" -}}
{{- default "Always" .Values.image.pullPolicy }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "gpu-process-exporter.serviceAccountName" -}}
{{- $serviceAccount := default dict .Values.serviceAccount -}}
{{- if and (hasKey $serviceAccount "create") (eq $serviceAccount.create false) -}}
{{- required "serviceAccount.name is required when serviceAccount.create is false" $serviceAccount.name }}
{{- else -}}
{{- default "gpu-process-exporter" $serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Create the cluster role name to use
*/}}
{{- define "gpu-process-exporter.clusterRoleName" -}}
{{- default "gpu-exporter-role" .Values.rbac.clusterRoleName }}
{{- end }}
{{/*
Create the cluster role binding name to use
*/}}
{{- define "gpu-process-exporter.clusterRoleBindingName" -}}
{{- default "gpu-exporter-binding" .Values.rbac.clusterRoleBindingName }}
{{- end }}
{{/*
Create the Prometheus scrape interval to use
*/}}
{{- define "gpu-process-exporter.prometheusScrapeInterval" -}}
{{- default "20s" .Values.prometheusScrape.interval }}
{{- end }}
{{/*
Create the port to use by the container and service
*/}}
{{- define "gpu-process-exporter.port" -}}
{{- default 9494 .Values.port }}
{{- end }}
{{/*
Create the service type to use
*/}}
{{- define "gpu-process-exporter.serviceType" -}}
{{- default "ClusterIP" .Values.service.type }}
{{- end }}
{{/*
Create the host proc mount to use by the container
*/}}
{{- define "gpu-process-exporter.hostProcMount" -}}
{{- default "/proc" .Values.hostProcMount }}
{{- end }}
103 changes: 103 additions & 0 deletions charts/gpu-process-exporter/templates/daemonset.yaml
@@ -0,0 +1,103 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{ .Chart.Name }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ .Chart.Name }}
      app.kubernetes.io/instance: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ .Chart.Name }}
        app.kubernetes.io/instance: {{ .Release.Name }}
    spec:
      hostPID: true
      nodeSelector:
        nvidia.com/gpu.present: "true"
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      serviceAccountName: {{ include "gpu-process-exporter.serviceAccountName" . }}
      containers:
        - name: exporter
          image: "{{ include "gpu-process-exporter.imageRepository" . }}:{{ include "gpu-process-exporter.imageTag" . }}"
          imagePullPolicy: {{ include "gpu-process-exporter.imagePullPolicy" . }}
          env:
            - name: NVIDIA_VISIBLE_DEVICES
              value: "all"
            - name: NVIDIA_DRIVER_CAPABILITIES
              value: "utility"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: EXPORTER_PORT
              value: {{ include "gpu-process-exporter.port" . | quote }}
            {{- if .Values.nvmlSearchPath }}
            - name: NVML_SEARCH_PATH
              value: {{ .Values.nvmlSearchPath | quote }}
            {{- end }}
            - name: SCRAPE_INTERVAL
              value: {{ include "gpu-process-exporter.prometheusScrapeInterval" . | quote }}
          ports:
            - containerPort: {{ include "gpu-process-exporter.port" . }}
              name: metrics
          securityContext:
            privileged: true
            runAsUser: 0
          {{- if .Values.resources }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          {{- end }}
          volumeMounts:
            - name: host-root
              mountPath: /host/root
              readOnly: true
            - name: host-proc
              mountPath: /host/proc
              readOnly: true
      volumes:
        - name: host-root
          hostPath:
            path: /
        - name: host-proc
          hostPath:
            path: {{ include "gpu-process-exporter.hostProcMount" . }}
---
apiVersion: v1
kind: Service
metadata:
  name: {{ .Chart.Name }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
  {{- $hasAnnotations := or .Values.prometheusScrape.annotate (and .Values.service.annotations (gt (len .Values.service.annotations) 0)) }}
  {{- if $hasAnnotations }}
  annotations:
    {{- if .Values.prometheusScrape.annotate }}
    prometheus.io/scrape: {{ .Values.prometheusScrape.annotate | quote }}
    prometheus.io/port: {{ include "gpu-process-exporter.port" . | quote }}
    prometheus.io/path: "/metrics"
    {{- end }}
    {{- if .Values.service.annotations }}
    {{- toYaml .Values.service.annotations | nindent 4 }}
    {{- end }}
  {{- end }}
spec:
  selector:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
  ports:
    - name: metrics
      protocol: TCP
      port: {{ include "gpu-process-exporter.port" . }}
      targetPort: {{ include "gpu-process-exporter.port" . }}
  type: {{ include "gpu-process-exporter.serviceType" . }}
40 changes: 40 additions & 0 deletions charts/gpu-process-exporter/templates/rbac.yaml
@@ -0,0 +1,40 @@
{{- if .Values.serviceAccount.create }}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "gpu-process-exporter.serviceAccountName" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
---
{{- end }}
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ include "gpu-process-exporter.clusterRoleName" . }}
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ include "gpu-process-exporter.clusterRoleBindingName" . }}
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
subjects:
  - kind: ServiceAccount
    name: {{ include "gpu-process-exporter.serviceAccountName" . }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: {{ include "gpu-process-exporter.clusterRoleName" . }}
  apiGroup: rbac.authorization.k8s.io
{{- end }}