From 321962f393e8fb8f6660fb24c84b72bec902e08c Mon Sep 17 00:00:00 2001 From: Paritosh Dixit Date: Mon, 23 Mar 2026 00:41:57 +0000 Subject: [PATCH 1/6] docs(spark): add local Ollama inference setup section Add step-by-step instructions for setting up local inference with Ollama on DGX Spark, covering NVIDIA runtime verification, Ollama install and model pre-load, OLLAMA_HOST=0.0.0.0 configuration, and sandbox connection with verification. Fixes #314, #385 --- spark-install.md | 81 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/spark-install.md b/spark-install.md index 140b36d02..71bc36b72 100644 --- a/spark-install.md +++ b/spark-install.md @@ -95,6 +95,87 @@ newgrp docker # or log out and back in nemoclaw onboard ``` +## Setup Local Inference (Ollama) + +Use this to run inference locally on the DGX Spark's GPU instead of routing to NVIDIA cloud. + +### Step 1: Verify NVIDIA Container Runtime + +```bash +docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi +``` + +If this fails, configure the NVIDIA runtime and restart Docker: + +```bash +sudo nvidia-ctk runtime configure --runtime=docker +sudo systemctl restart docker +``` + +### Step 2: Install Ollama + +```bash +curl -fsSL https://ollama.com/install.sh | sh +``` + +Verify it is running: + +```bash +curl http://localhost:11434 +``` + +### Step 3: Pull and Pre-load a Model + +Download Nemotron 3 Super 120B (~87 GB; may take several minutes): + +```bash +ollama pull nemotron-3-super:120b +``` + +Run it briefly to pre-load weights into unified memory, then exit: + +```bash +ollama run nemotron-3-super:120b +# type /bye to exit +``` + +### Step 4: Configure Ollama to Listen on All Interfaces + +By default Ollama binds to `127.0.0.1`, which is not reachable from inside the sandbox container. 
Configure it to listen on all interfaces:
+
+```bash
+sudo mkdir -p /etc/systemd/system/ollama.service.d
+printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | sudo tee /etc/systemd/system/ollama.service.d/override.conf
+
+sudo systemctl daemon-reload
+sudo systemctl restart ollama
+```
+
+Verify Ollama is listening on `0.0.0.0`:
+
+```bash
+sudo netstat -nap | grep 11434
+```
+
+### Step 5: Install OpenShell and NemoClaw
+
+```bash
+curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
+curl -fsSL https://nvidia.com/nemoclaw.sh | bash
+```
+
+When prompted for **Inference options**, select **Local Ollama**, then select the model you pulled.
+
+### Step 6: Connect and Test
+
+```bash
+# Connect to the sandbox
+nemoclaw my-assistant connect
+
+# Inside the sandbox, talk to the agent
+openclaw agent --agent main --local -m "Which model and GPU are in use?" --session-id test
+```
+
 ## Known Issues
 
 | Issue | Status | Workaround |

From 13bbde0627fc26d6e0a940f73816b420091b4325 Mon Sep 17 00:00:00 2001
From: Paritosh Dixit
Date: Mon, 23 Mar 2026 01:19:13 +0000
Subject: [PATCH 2/6] docs(spark): prefer ss over netstat for listener
 verification

netstat requires net-tools, which is not installed by default on Ubuntu
24.04. ss from iproute2 is available by default and is more reliable for
verifying listening sockets.

Signed-off-by: Paritosh Dixit
---
 spark-install.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/spark-install.md b/spark-install.md
index 71bc36b72..09632dceb 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -97,7 +97,7 @@ nemoclaw onboard
 
 ## Setup Local Inference (Ollama)
 
-Use this to run inference locally on the DGX Spark's GPU instead of routing to NVIDIA cloud.
+Use this to run inference locally on the DGX Spark's GPU instead of routing to the cloud.
### Step 1: Verify NVIDIA Container Runtime @@ -151,10 +151,10 @@ sudo systemctl daemon-reload sudo systemctl restart ollama ``` -Verify Ollama is listening on `0.0.0.0`: +Verify Ollama is listening on all interfaces: ```bash -sudo netstat -nap | grep 11434 +ss -tlnp | grep 11434 ``` ### Step 5: Install OpenShell and NemoClaw From a9dbc13e8855c50e06595bbd6295ee5102983a7f Mon Sep 17 00:00:00 2001 From: Paritosh Dixit Date: Mon, 23 Mar 2026 01:25:17 +0000 Subject: [PATCH 3/6] docs(spark): add direct inference.local check in Step 6 Add explicit curl to https://inference.local/v1/models inside the sandbox to validate the proxy route before running the agent. This prevents fallback paths from masking regressions in the fix for #314. --- spark-install.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/spark-install.md b/spark-install.md index 09632dceb..78e76c3e9 100644 --- a/spark-install.md +++ b/spark-install.md @@ -171,8 +171,18 @@ When prompted for **Inference options**, select **Local Ollama**, then select th ```bash # Connect to the sandbox nemoclaw my-assistant connect +``` + +Inside the sandbox, first verify `inference.local` is reachable directly (must use HTTPS — the proxy intercepts `CONNECT inference.local:443`): -# Inside the sandbox, talk to the agent +```bash +curl -s https://inference.local/v1/models +# Expected: JSON response listing the configured model +``` + +Then talk to the agent: + +```bash openclaw agent --agent main --local -m "Which model and GPU are in use?" --session-id test ``` From 8d02c4d693cd9a8b835deb2942211cf1405eaf99 Mon Sep 17 00:00:00 2001 From: Paritosh Dixit Date: Mon, 23 Mar 2026 01:33:13 +0000 Subject: [PATCH 4/6] docs(spark): fail fast on non-200 from inference.local probe Use curl -sf so the check exits non-zero on HTTP errors (403, 503, etc.), preventing a silent 403 from masking a proxy routing regression. 
Signed-off-by: Paritosh Dixit --- spark-install.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/spark-install.md b/spark-install.md index 78e76c3e9..0fce0eaef 100644 --- a/spark-install.md +++ b/spark-install.md @@ -161,7 +161,7 @@ ss -tlnp | grep 11434 ```bash curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh -curl -fsSL https://nvidia.com/nemoclaw.sh | bash +curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash ``` When prompted for **Inference options**, select **Local Ollama**, then select the model you pulled. @@ -176,8 +176,9 @@ nemoclaw my-assistant connect Inside the sandbox, first verify `inference.local` is reachable directly (must use HTTPS — the proxy intercepts `CONNECT inference.local:443`): ```bash -curl -s https://inference.local/v1/models +curl -sf https://inference.local/v1/models # Expected: JSON response listing the configured model +# Exits non-zero on HTTP errors (403, 503, etc.) — failure here indicates a proxy routing regression ``` Then talk to the agent: From f4b660efb43ab5b86e36fe2b02e382f5a1f6ffe7 Mon Sep 17 00:00:00 2001 From: Paritosh Dixit Date: Mon, 23 Mar 2026 23:57:06 +0000 Subject: [PATCH 5/6] docs: Move local ollama inference section up Signed-off-by: Paritosh Dixit --- spark-install.md | 133 +++++++++++++++++++++++------------------------ 1 file changed, 64 insertions(+), 69 deletions(-) diff --git a/spark-install.md b/spark-install.md index 1d65fb95c..406e54047 100644 --- a/spark-install.md +++ b/spark-install.md @@ -19,7 +19,7 @@ curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | git clone https://github.com/NVIDIA/NemoClaw.git cd NemoClaw -# Spark-specific setup +# Spark-specific setup (For details see [What's Different on Spark](#whats-different-on-spark)) sudo ./scripts/setup-spark.sh # Install NemoClaw using the NemoClaw/install.sh: @@ -29,65 +29,21 @@ sudo ./scripts/setup-spark.sh curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash ``` 
-## What's Different on Spark - -DGX Spark ships **Ubuntu 24.04 + Docker 28.x** but no k8s/k3s. OpenShell embeds k3s inside a Docker container, which hits two problems on Spark: - -### 1. Docker permissions - -```text -Error in the hyper legacy client: client error (Connect) - Permission denied (os error 13) -``` - -**Cause**: Your user isn't in the `docker` group. -**Fix**: `setup-spark` runs `usermod -aG docker $USER`. You may need to log out and back in (or `newgrp docker`) for it to take effect. - -### 2. cgroup v2 incompatibility - -```text -K8s namespace not ready -openat2 /sys/fs/cgroup/kubepods/pids.max: no -Failed to start ContainerManager: failed to initialize top level QOS containers -``` - -**Cause**: Spark runs cgroup v2 (Ubuntu 24.04 default). OpenShell's gateway container starts k3s, which tries to create cgroup v1-style paths that don't exist. The fix is `--cgroupns=host` on the container, but OpenShell doesn't expose that flag. - -**Fix**: `setup-spark` sets `"default-cgroupns-mode": "host"` in `/etc/docker/daemon.json` and restarts Docker. This makes all containers use the host cgroup namespace, which is what k3s needs. 
- -## Manual Setup (if setup-spark doesn't work) - -### Fix Docker cgroup namespace +## Verifying Your Install ```bash -# Check if you're on cgroup v2 -stat -fc %T /sys/fs/cgroup/ -# Expected: cgroup2fs - -# Add cgroupns=host to Docker daemon config -sudo python3 -c " -import json, os -path = '/etc/docker/daemon.json' -d = json.load(open(path)) if os.path.exists(path) else {} -d['default-cgroupns-mode'] = 'host' -json.dump(d, open(path, 'w'), indent=2) -" - -# Restart Docker -sudo systemctl restart docker -``` - -### Fix Docker permissions +# Check sandbox is running +nemoclaw my-assistant connect -```bash -sudo usermod -aG docker $USER -newgrp docker # or log out and back in +# Inside the sandbox, talk to the agent: +openclaw agent --agent main --local -m "hello" --session-id test ``` -### Then run the onboard wizard +## Uninstall (perform this before re-installing) ```bash -nemoclaw onboard +# Uninstall NemoClaw (Remove OpenShell sandboxes, gateway, NemoClaw providers, related Docker containers, images, volumes and configs) +nemoclaw uninstall ``` ## Setup Local Inference (Ollama) @@ -182,6 +138,61 @@ Then talk to the agent: openclaw agent --agent main --local -m "Which model and GPU are in use?" --session-id test ``` +## What's Different on Spark + +DGX Spark ships **Ubuntu 24.04 + Docker 28.x** but no k8s/k3s. OpenShell embeds k3s inside a Docker container, which hits two problems on Spark: + +### 1. Docker permissions + +```text +Error in the hyper legacy client: client error (Connect) + Permission denied (os error 13) +``` + +**Cause**: Your user isn't in the `docker` group. +**Fix**: `setup-spark` runs `usermod -aG docker $USER`. You may need to log out and back in (or `newgrp docker`) for it to take effect. + +### 2. 
cgroup v2 incompatibility + +```text +K8s namespace not ready +openat2 /sys/fs/cgroup/kubepods/pids.max: no +Failed to start ContainerManager: failed to initialize top level QOS containers +``` + +**Cause**: Spark runs cgroup v2 (Ubuntu 24.04 default). OpenShell's gateway container starts k3s, which tries to create cgroup v1-style paths that don't exist. The fix is `--cgroupns=host` on the container, but OpenShell doesn't expose that flag. + +**Fix**: `setup-spark` sets `"default-cgroupns-mode": "host"` in `/etc/docker/daemon.json` and restarts Docker. This makes all containers use the host cgroup namespace, which is what k3s needs. + +## Manual Setup (if setup-spark doesn't work) + +### Fix Docker cgroup namespace + +```bash +# Check if you're on cgroup v2 +stat -fc %T /sys/fs/cgroup/ +# Expected: cgroup2fs + +# Add cgroupns=host to Docker daemon config +sudo python3 -c " +import json, os +path = '/etc/docker/daemon.json' +d = json.load(open(path)) if os.path.exists(path) else {} +d['default-cgroupns-mode'] = 'host' +json.dump(d, open(path, 'w'), indent=2) +" + +# Restart Docker +sudo systemctl restart docker +``` + +### Fix Docker permissions + +```bash +sudo usermod -aG docker $USER +newgrp docker # or log out and back in +``` + ## Known Issues | Issue | Status | Workaround | @@ -192,22 +203,6 @@ openclaw agent --agent main --local -m "Which model and GPU are in use?" 
--session-id test
 | Image pull failure (k3s can't find built image) | OpenShell bug | `openshell gateway destroy && openshell gateway start`, re-run setup |
 | GPU passthrough | Untested on Spark | Should work with `--gpu` flag if NVIDIA Container Toolkit is configured |
 
-## Verifying Your Install
-
-```bash
-# Check sandbox is running
-openshell sandbox list
-# Should show: nemoclaw Ready
-
-# Test the agent
-openshell sandbox connect nemoclaw
-# Inside sandbox:
-nemoclaw-start openclaw agent --agent main --local -m 'hello' --session-id test
-
-# Monitor network egress (separate terminal)
-openshell term
-```
-
 ## Architecture Notes
 
 ```text

From 9ac0867cebb9bf3b003fe653a14d4ac2cd52eada Mon Sep 17 00:00:00 2001
From: Paritosh Dixit
Date: Tue, 24 Mar 2026 13:21:54 +0000
Subject: [PATCH 6/6] docs: Resolved review comments

Signed-off-by: Paritosh Dixit
---
 spark-install.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/spark-install.md b/spark-install.md
index 406e54047..0898ae801 100644
--- a/spark-install.md
+++ b/spark-install.md
@@ -105,12 +105,16 @@ sudo systemctl restart ollama
 Verify Ollama is listening on all interfaces:
 
 ```bash
-ss -tlnp | grep 11434
+sudo ss -tlnp | grep 11434
 ```
 
 ### Step 5: Install OpenShell and NemoClaw
 
 ```bash
+# If OpenShell and NemoClaw are already installed, uninstall them. A fresh NemoClaw install will run onboard with local inference options.
+nemoclaw uninstall
+
+# Install OpenShell and NemoClaw
 curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
 curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
 ```
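
The Step 4 listener check (`sudo ss -tlnp | grep 11434`) can also be made fail-fast for provisioning scripts. A minimal sketch, assuming typical iproute2 output; the `check_ollama_bind` helper and the sample `ss` lines below are illustrative, not part of the installer, and real `ss` field order can vary slightly between versions:

```shell
#!/bin/sh
# Succeed (exit 0) only if an `ss` listing shows TCP port 11434 bound to a
# wildcard address (0.0.0.0, *, or [::]) rather than loopback only.
check_ollama_bind() {
    # $1: output of `ss -tln` (or `sudo ss -tlnp`)
    printf '%s\n' "$1" | grep -Eq 'LISTEN.*(0\.0\.0\.0|\*|\[::\]):11434'
}

# Illustrative samples; on the Spark itself, feed in: sudo ss -tlnp | grep 11434
wildcard='LISTEN 0 4096 0.0.0.0:11434 0.0.0.0:*'
loopback='LISTEN 0 4096 127.0.0.1:11434 0.0.0.0:*'

check_ollama_bind "$wildcard" && echo "reachable from the sandbox"
check_ollama_bind "$loopback" || echo "loopback only; rerun Step 4"
```

Run under `set -e`, a check like this aborts provisioning instead of letting a loopback-only bind pass silently and surface later as a sandbox connection failure.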