fix(security): stop passing NVIDIA_API_KEY into sandbox and command lines by ericksoa · Pull Request #675 · NVIDIA/NemoClaw

ericksoa · 2026-03-23T00:21:24Z

Summary

Closes #429. Lands independently of #617.

The OpenShell gateway proxies inference and injects stored credentials server-side, so the raw NVIDIA_API_KEY was never needed inside the sandbox. It was nevertheless passed via env args, setup.sh, walkthrough commands, and the setupSpark sudo call, which exposed it in ps aux, /proc/pid/cmdline, docker inspect, and k3s audit logs.

Remove NVIDIA_API_KEY from openshell sandbox create env args (onboard.js, setup.sh)
Use env-name-only --credential NVIDIA_API_KEY form in setup.sh (same pattern as #330, "fix(security): stop leaking API keys in process args visible via ps aux")
Remove key from walkthrough.sh tmux/connect commands
Remove unnecessary key + ensureApiKey() from setupSpark (script never reads it)
Clear key from process.env after setupInference handoff
Add 6 regression tests for credential exposure
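
The env-name-only form can be illustrated with a small Node sketch. It assumes only what the summary states: `--credential NVIDIA_API_KEY` names the variable, and the callee resolves the value from its own environment. The helpers `credentialArgs`, `argvLeaksSecret`, and `runStandIn` are hypothetical, and the child is a stand-in `node -e` process rather than `openshell`.

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: only the variable NAME goes on the command line.
function credentialArgs(envName) {
  return ["--credential", envName];
}

// Hypothetical helper: true if any argv entry contains the secret value.
function argvLeaksSecret(argv, secret) {
  return argv.some((arg) => arg.includes(secret));
}

// Stand-in for the real CLI (assumption): a child that reports its own argv
// and whether it could resolve the named variable from its environment.
function runStandIn(envName, secret) {
  const script =
    "const name = process.argv[process.argv.indexOf('--credential') + 1];" +
    "console.log(JSON.stringify({" +
    "  argv: process.argv.slice(1)," +
    "  resolved: process.env[name] !== undefined" +
    "}));";
  const result = spawnSync(
    process.execPath,
    ["-e", script, "--", ...credentialArgs(envName)],
    { env: { ...process.env, [envName]: secret }, encoding: "utf-8" }
  );
  return JSON.parse(result.stdout);
}

const report = runStandIn("NVIDIA_API_KEY", "example-secret-value");
console.log(report.resolved, argvLeaksSecret(report.argv, "example-secret-value"));
```

The child can still resolve the credential, but the value never appears in its argument list, which is the part that `ps aux` and `/proc/pid/cmdline` expose.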

What this does NOT fix

/proc/pid/environ of the nemoclaw process itself: the kernel snapshot is immutable after exec, so delete process.env.NVIDIA_API_KEY only prevents child process inheritance. A true fix requires file-based credential loading (OpenShell provider model change).
--credential NVIDIA_API_KEY=<value> in onboard.js setupInference: that is the scope of #330 ("fix(security): stop leaking API keys in process args visible via ps aux").
Messaging tokens in sandbox env: blocked on the merge of #617 ("fix: standardize messaging integrations with host-side bridge framework").
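
The first limitation can be probed with a short Node sketch. It starts a throwaway process with a hypothetical `DEMO_SECRET` variable, deletes it from `process.env`, and then reports what a grandchild sees and, where `/proc/self/environ` exists, whether the process's own startup snapshot still mentions the variable. The variable name and the `probeEnvDeletion` helper are assumptions for illustration only.

```javascript
const { spawnSync } = require("child_process");

// Code that runs inside a process started WITH DEMO_SECRET in its environment.
const probe = `
  const fs = require("fs");
  const { spawnSync } = require("child_process");
  delete process.env.DEMO_SECRET;
  // Children spawned after the delete inherit the current process.env.
  const child = spawnSync(process.execPath, ["-p", "String(process.env.DEMO_SECRET)"], { encoding: "utf-8" });
  // Linux only: the environment block captured at exec time.
  const environPath = "/proc/self/environ";
  const snapshot = fs.existsSync(environPath) ? fs.readFileSync(environPath, "utf-8") : null;
  console.log(JSON.stringify({
    childSees: child.stdout.trim(),
    inOwnSnapshot: snapshot === null ? null : snapshot.includes("DEMO_SECRET="),
  }));
`;

// Hypothetical helper: launch the probe with the variable set and parse its report.
function probeEnvDeletion() {
  const result = spawnSync(process.execPath, ["-e", probe], {
    env: { ...process.env, DEMO_SECRET: "example-value" },
    encoding: "utf-8",
  });
  return JSON.parse(result.stdout);
}

console.log(probeEnvDeletion());
```

`childSees` should be the string `undefined`, showing that inheritance is cut off. On Linux, `inOwnSnapshot` is expected to remain `true`, which is the residual exposure described above; on platforms without `/proc` the field is `null`.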

Why it works

Verified in OpenShell source:

proxy.rs:1068 — gateway strips all Authorization / X-Api-Key headers from sandbox requests
backend.rs:101 — gateway re-authenticates upstream using stored provider key
grpc.rs:3446 — inference provider credentials are NOT injected into sandbox env
nemoclaw-start.sh:19 — write_auth_profile() gracefully no-ops if key is absent
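
As a rough model of the first two points, the sketch below strips client-supplied auth headers and re-attaches a stored key. It is not OpenShell code (the gateway is written in Rust); the function name `reauthenticate` and the `Bearer` scheme are assumptions made for illustration.

```javascript
// Headers the gateway is described as stripping from sandbox requests.
const CLIENT_AUTH_HEADERS = new Set(["authorization", "x-api-key"]);

// Hypothetical model: drop whatever credentials the sandbox sent, then
// authenticate upstream with the key stored on the gateway side.
function reauthenticate(sandboxHeaders, storedProviderKey) {
  const upstream = {};
  for (const [name, value] of Object.entries(sandboxHeaders)) {
    // HTTP header names are case-insensitive, so compare in lowercase.
    if (!CLIENT_AUTH_HEADERS.has(name.toLowerCase())) {
      upstream[name] = value;
    }
  }
  // Assumption: the upstream provider expects a Bearer token.
  upstream["Authorization"] = `Bearer ${storedProviderKey}`;
  return upstream;
}

console.log(
  reauthenticate(
    { authorization: "Bearer anything", "X-Api-Key": "ignored", "Content-Type": "application/json" },
    "stored-key"
  )
);
```

Because the sandbox's own headers are discarded, any value the sandbox holds (or lacks) has no effect on the upstream request, which is why the key can be absent from the sandbox environment.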

Test plan

225/225 tests pass (including 6 new regression tests)
E2E: nemoclaw onboard completes, inference works through gateway without key in sandbox

Summary by CodeRabbit

Tests
- Added tests to ensure sensitive credentials are not embedded or forwarded during agent setup and sandbox creation.
Chores
- Stopped forwarding the GPU/API key into sandbox and setup commands; added post-setup environment cleanup to avoid propagation.
- Updated walkthrough and inline documentation to reflect gateway-side credential handling and streamlined agent startup.

coderabbitai · 2026-03-23T00:21:38Z

📝 Walkthrough

Walkthrough

NVIDIA API key injection moved from client/sandbox env vars to server-side OpenShell gateway injection. The key is no longer forwarded into sandbox or child processes; onboard now explicitly deletes process.env.NVIDIA_API_KEY. Tests added to assert the key is not embedded in scripts or commands.

Changes

Cohort / File(s)	Summary
Onboard & sandbox creation `bin/lib/onboard.js`	Removed forwarding of `NVIDIA_API_KEY` into sandbox envArgs; added documentation about gateway-based inference auth; added `delete process.env.NVIDIA_API_KEY` after `setupInference(...)`.
Spark/setup script invocation `bin/nemoclaw.js`	Removed `ensureApiKey()` precondition and stopped passing `NVIDIA_API_KEY` through `sudo` into `setup-spark.sh`.
Shell setup & sandbox CLI `scripts/setup.sh`, `scripts/walkthrough.sh`	Changed provider/config arg to avoid embedding `NVIDIA_API_KEY=$...`; removed `-- env NVIDIA_API_KEY=...` from sandbox create/connect and tmux/agent commands; added comments about gateway injection.
Tests `test/runner.test.js`	Added `credential exposure guards (#429)` tests to statically verify `NVIDIA_API_KEY` is not present in sandbox env args, setup scripts, spark invocations, or walkthrough commands.

Sequence Diagram(s)

sequenceDiagram
    participant DevShell as Dev Shell
    participant CLI as Nemoclaw CLI
    participant Gateway as OpenShell Gateway
    participant Sandbox as Sandbox Container
    participant Inference as Inference Service

    DevShell->>CLI: run onboard / setup
    CLI->>Gateway: request sandbox + credentials (server-side)
    Gateway->>Sandbox: inject short-lived credential proxy (not env var)
    CLI->>Sandbox: start agent (no NVIDIA_API_KEY in env)
    Sandbox->>Gateway: proxied inference requests (Authorization stripped by proxy)
    Gateway->>Inference: forward authenticated requests
    Note right of Gateway: process.env.NVIDIA_API_KEY deleted in parent CLI

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I tuck the keys away at night,

No env var shows, no log takes flight,
The gateway hums a quiet tune,
Credentials kept beneath the moon,
— Hopsecure, the guarding hare 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main security fix: preventing NVIDIA_API_KEY from being passed into sandbox and command lines.
Linked Issues check	✅ Passed	All code objectives from `#429` are met: API key removed from sandbox env args [onboard.js, setup.sh], env-name-only credential form added [setup.sh], key removed from tmux/connect commands [walkthrough.sh], process.env cleanup added [onboard.js], and regression tests added.
Out of Scope Changes check	✅ Passed	All changes directly support the security fix for `#429`; no unrelated changes detected. Modifications to onboard.js, nemoclaw.js, setup.sh, walkthrough.sh, and test additions are all within scope.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/walkthrough.sh`:
- Around line 46-49: Remove the hard failure that requires NVIDIA_API_KEY by
deleting or changing the conditional block that checks [ -n
"${NVIDIA_API_KEY:-}" ] and exits (the NVIDIA_API_KEY guard); instead, either
remove the check entirely or replace it with a non-fatal warning (echo a message
if empty but do not call exit 1) so the walkthrough can run without that secret
in the host environment; update any references to NVIDIA_API_KEY in the script
to handle an empty value safely where actually needed.

In `@test/runner.test.js`:
- Around line 263-274: The current test "walkthrough.sh does not embed
NVIDIA_API_KEY in tmux or sandbox commands" only scans tmux/openshell lines
(cmdLines) so other exposures can slip through; update the test to scan the full
script content (src) — or at least all non-comment/non-echo executable lines —
for the string "NVIDIA_API_KEY" and assert no matches, replacing the filtered
cmdLines logic with a broader filter over src.split("\n") (reusing variables
like src, fs, path) so any direct env usage anywhere in walkthrough.sh fails the
test.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 10e304b6-c6bb-4666-9a41-7b37e1097a57

📥 Commits

Reviewing files that changed from the base of the PR and between d37a09f and f166577.

📒 Files selected for processing (5)

bin/lib/onboard.js
bin/nemoclaw.js
scripts/setup.sh
scripts/walkthrough.sh
test/runner.test.js

coderabbitai · 2026-03-23T00:25:23Z

scripts/walkthrough.sh

+[ -n "${NVIDIA_API_KEY:-}" ] || {
+  echo "NVIDIA_API_KEY required"
+  exit 1
+}


⚠️ Potential issue | 🟠 Major

Remove hard NVIDIA_API_KEY runtime requirement from walkthrough.

The walkthrough no longer injects this key into sandbox commands, so failing fast here unnecessarily keeps the secret in host process environments.

🔧 Proposed fix

-[ -n "${NVIDIA_API_KEY:-}" ] || {
-  echo "NVIDIA_API_KEY required"
-  exit 1
-}
+# NVIDIA_API_KEY is not required at walkthrough runtime.
+# Credentials are resolved server-side by the OpenShell gateway provider config.


coderabbitai · 2026-03-23T00:25:23Z

test/runner.test.js

+    it("walkthrough.sh does not embed NVIDIA_API_KEY in tmux or sandbox commands", () => {
+      const fs = require("fs");
+      const src = fs.readFileSync(path.join(__dirname, "..", "scripts", "walkthrough.sh"), "utf-8");
+      // Check only executable lines (tmux spawn, openshell connect) — not comments/docs
+      const cmdLines = src.split("\n").filter(
+        (l) => !l.trim().startsWith("#") && !l.trim().startsWith("echo") &&
+               (l.includes("tmux") || l.includes("openshell sandbox connect"))
+      );
+      for (const line of cmdLines) {
+        expect(line.includes("NVIDIA_API_KEY")).toBe(false);
+      }
+    });


⚠️ Potential issue | 🟡 Minor

Credential regression guard is too narrow for walkthrough script.

This check only inspects tmux/connect command lines, so a direct env dependency/exposure path can slip through undetected.

🧪 Proposed test hardening

     it("walkthrough.sh does not embed NVIDIA_API_KEY in tmux or sandbox commands", () => {
       const fs = require("fs");
       const src = fs.readFileSync(path.join(__dirname, "..", "scripts", "walkthrough.sh"), "utf-8");
+      // Guard against runtime env preconditions that force secret presence.
+      expect(src.includes('[ -n "${NVIDIA_API_KEY:-}" ]')).toBe(false);
       // Check only executable lines (tmux spawn, openshell connect) — not comments/docs
       const cmdLines = src.split("\n").filter(
         (l) => !l.trim().startsWith("#") && !l.trim().startsWith("echo") &&
                (l.includes("tmux") || l.includes("openshell sandbox connect"))
       );
       for (const line of cmdLines) {
         expect(line.includes("NVIDIA_API_KEY")).toBe(false);
       }
     });
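
The broader scan requested in this comment could look roughly like the sketch below: every line that is not a comment or an `echo` is checked, not only the tmux/connect lines. `findCredentialReferences` is a hypothetical helper, and the sample input mirrors the guard quoted earlier in this review rather than the real script.

```javascript
// Hypothetical helper: return every executable line that mentions the variable.
function findCredentialReferences(src, name) {
  return src
    .split("\n")
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    // Skip blanks, shell comments, and echo statements, as the original test does.
    .filter(({ text }) => text !== "" && !text.startsWith("#") && !text.startsWith("echo"))
    .filter(({ text }) => text.includes(name));
}

// Sample input modelled on the guard block flagged in walkthrough.sh.
const sample = [
  "# NVIDIA_API_KEY is not required at walkthrough runtime.",
  '[ -n "${NVIDIA_API_KEY:-}" ] || {',
  '  echo "NVIDIA_API_KEY required"',
  "  exit 1",
  "}",
].join("\n");

console.log(findCredentialReferences(sample, "NVIDIA_API_KEY"));
```

In a test, asserting that the returned array is empty would fail on the guard line and report its line number. The filter is deliberately simple: an executable line with a trailing comment that mentions the variable would also be flagged, which may or may not be desirable.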


Main switched to --credential "OPENAI_API_KEY" (env-lookup form from #675). We set the proxy token in the environment so openshell reads it via the env-name-only pattern while still using our random per-instance token instead of the static "ollama" dummy value.

cv

LGTM — important security fix. The gateway already injects credentials server-side, so the sandbox never needed the raw key. Good documentation of the /proc/pid/environ limitation and solid regression tests.

…ines

The OpenShell gateway proxies inference requests and injects stored credentials server-side (proxy.rs strips client auth headers, backend.rs re-authenticates upstream). The raw key was never needed inside the sandbox but was passed via env args, setup.sh, walkthrough commands, and the setupSpark sudo call, exposing it in ps aux, /proc/pid/cmdline, docker inspect, and k3s audit logs.

Changes:
- Remove NVIDIA_API_KEY from openshell sandbox create env args
- Use env-name-only credential form in setup.sh
- Remove key from walkthrough.sh tmux/connect commands
- Remove unnecessary key + ensureApiKey() from setupSpark
- Clear key from process.env after setupInference handoff
- Add 6 regression tests for credential exposure

Does NOT fix /proc/pid/environ (kernel snapshot is immutable after exec; requires file-based credential loading in OpenShell). Messaging tokens left in sandbox env pending #617 merge.

Closes #429.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/setup.sh`:
- Around line 198-202: The openshell sandbox creation call (openshell sandbox
create ... >"$CREATE_LOG" 2>&1) can inherit the exported NVIDIA_API_KEY from the
parent shell; before invoking openshell (or as a prefix), remove that env var to
avoid leaking credentials — either run unset NVIDIA_API_KEY in the script just
before the openshell sandbox create invocation or invoke the command with env -u
NVIDIA_API_KEY to run it with that variable removed; update the section around
the SANDBOX_NAME/CREATE_LOG invocation accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 36a8fe36-f587-4dbf-a235-d7e3536ca100

📥 Commits

Reviewing files that changed from the base of the PR and between f166577 and 123b492.

📒 Files selected for processing (5)

bin/lib/onboard.js
bin/nemoclaw.js
scripts/setup.sh
scripts/walkthrough.sh
test/runner.test.js

🚧 Files skipped from review as they are similar to previous changes (4)

bin/nemoclaw.js
bin/lib/onboard.js
scripts/walkthrough.sh
test/runner.test.js

coderabbitai · 2026-03-23T21:04:41Z

scripts/setup.sh

+# NVIDIA_API_KEY is NOT passed into the sandbox. Inference is proxied through
+# the OpenShell gateway which injects the stored credential server-side.
 openshell sandbox create --from "$BUILD_CTX/Dockerfile" --name "$SANDBOX_NAME" \
  --provider nvidia-nim \
-  -- env NVIDIA_API_KEY="$NVIDIA_API_KEY" >"$CREATE_LOG" 2>&1
+  >"$CREATE_LOG" 2>&1


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
nl -ba scripts/setup.sh | sed -n '132,210p'
rg -n 'unset NVIDIA_API_KEY|env -u NVIDIA_API_KEY|openshell sandbox create' scripts/setup.sh

Repository: NVIDIA/NemoClaw

Length of output: 101

🏁 Script executed:

#!/bin/bash
# Re-check without nl command
cat -n scripts/setup.sh | sed -n '132,210p'
echo "---"
rg -n 'unset NVIDIA_API_KEY|env -u NVIDIA_API_KEY|openshell sandbox create' scripts/setup.sh

Repository: NVIDIA/NemoClaw

Length of output: 3759

Unset or scrub NVIDIA_API_KEY before sandbox creation to prevent child-process environment inheritance.

Although the command no longer explicitly passes -- env, the variable remains in the shell's exported environment and will be inherited by the sandbox process. Add either unset NVIDIA_API_KEY before the sandbox creation call, or use env -u NVIDIA_API_KEY as a prefix to the command.

Suggested patch

 upsert_provider \
   "nvidia-nim" \
   "openai" \
   "NVIDIA_API_KEY" \
   "OPENAI_BASE_URL=https://integrate.api.nvidia.com/v1"
+
+# Credential is now stored server-side; prevent further child-process inheritance.
+unset NVIDIA_API_KEY
@@
-openshell sandbox create --from "$BUILD_CTX/Dockerfile" --name "$SANDBOX_NAME" \
+env -u NVIDIA_API_KEY openshell sandbox create --from "$BUILD_CTX/Dockerfile" --name "$SANDBOX_NAME" \
   --provider nvidia-nim \
   >"$CREATE_LOG" 2>&1
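
For the Node entry points, the counterpart of `env -u NVIDIA_API_KEY` is to hand each child an explicit environment with the variable removed, leaving the parent's own environment untouched. A minimal sketch, with hypothetical helper names `scrubbedEnv` and `childSeesVariable` and a `node -p` child standing in for the real command:

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: copy of an environment with the named variables removed.
// The input object is not modified.
function scrubbedEnv(env, names) {
  const copy = { ...env };
  for (const name of names) {
    delete copy[name];
  }
  return copy;
}

// Hypothetical helper: ask a child process whether it can see the variable.
function childSeesVariable(name, env) {
  const expression = `process.env[${JSON.stringify(name)}] !== undefined`;
  const result = spawnSync(process.execPath, ["-p", expression], {
    env,
    encoding: "utf-8",
  });
  return result.stdout.trim() === "true";
}

const parentEnv = { ...process.env, NVIDIA_API_KEY: "example-value" };
console.log(childSeesVariable("NVIDIA_API_KEY", parentEnv));
console.log(childSeesVariable("NVIDIA_API_KEY", scrubbedEnv(parentEnv, ["NVIDIA_API_KEY"])));
```

Scrubbing per child keeps the key available to the one step that needs it, whereas `unset` (or `delete process.env.NVIDIA_API_KEY`) removes it for everything that runs afterwards.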


cv

Three questions:

1. setupInference still passes --credential NVIDIA_API_KEY (name-only form) on line 832. In this form, openshell reads the value from process.env.NVIDIA_API_KEY internally, so the value doesn't appear in ps aux. But what does openshell provider add do with it? If it writes the key to ~/.openshell/providers.json (or similar) in cleartext on the sandbox filesystem, the exposure surface moved from "visible in ps aux / docker inspect" to "plaintext config file on disk." Does the storage backend encrypt or otherwise protect the credential at rest?
2. The delete process.env.NVIDIA_API_KEY on line 1077 runs before setupOpenclaw and setupPolicies. If either function or their child processes reads the key (even indirectly), this silently breaks them. The 6 regression tests verify code patterns but don't verify runtime behavior, and the E2E test item is unchecked.
3. Messaging tokens (DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, TELEGRAM_BOT_TOKEN) are still passed as sandbox env args in createSandbox (lines 536+). Acknowledged in the PR as blocked on #617, but these tokens have the same docker inspect / ps aux exposure this PR addresses for NVIDIA_API_KEY. Noting for tracking purposes.
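
One way to check the second question at runtime, sketched under a strong assumption: the later steps would have to accept their environment as a parameter, which the real setupOpenclaw and setupPolicies may not do. The `auditStep` helper and the two stand-in steps below are hypothetical.

```javascript
// Hypothetical helper: run a step against a tracked view of the environment
// and count how many times it reads the named variable.
function auditStep(label, step, env, name) {
  let reads = 0;
  const tracked = new Proxy(env, {
    get(target, prop) {
      if (prop === name) {
        reads += 1;
      }
      return target[prop];
    },
  });
  step(tracked);
  return { label, reads };
}

// Stand-ins for later onboarding steps (assumptions, not the real functions).
const stepThatIgnoresKey = (env) => env.HOME;
const stepThatReadsKey = (env) => env.NVIDIA_API_KEY;

const env = { HOME: "/home/example", NVIDIA_API_KEY: "example-value" };
console.log(auditStep("ignores key", stepThatIgnoresKey, env, "NVIDIA_API_KEY"));
console.log(auditStep("reads key", stepThatReadsKey, env, "NVIDIA_API_KEY"));
```

A step that reports a nonzero count would be one that the early delete could break. The sketch only sees direct property reads in the same process; it does not observe child processes or `in` checks.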

coderabbitai bot reviewed Mar 23, 2026

cv approved these changes Mar 23, 2026

cv mentioned this pull request Mar 23, 2026

fix: standardize messaging integrations with host-side bridge framework #617

Open

16 tasks

ericksoa mentioned this pull request Mar 23, 2026

chore: unify vitest workspace and replace husky with prek #673

Merged

5 tasks

wscurran added the security Something isn't secure label Mar 23, 2026

cv force-pushed the fix/issue-429-credential-environ-exposure branch from f166577 to 123b492 Compare March 23, 2026 21:00

coderabbitai bot reviewed Mar 23, 2026

cv reviewed Mar 23, 2026

This was referenced Mar 24, 2026

fix(security): isolate API keys to specific child processes to prevent env pollution #172

Open

feat: expand provider onboarding and validation #648

Open

ericksoa wants to merge 1 commit into main from fix/issue-429-credential-environ-exposure
