Commit 30723a3

feat(onboard): add custom OpenAI-compatible provider option
Add a "Custom OpenAI-compatible endpoint" option to the onboarding wizard, allowing users to bring any provider that exposes an OpenAI-compatible /v1/chat/completions endpoint (e.g. Google Gemini via AI Studio, OpenRouter, Together AI, LiteLLM). The custom provider follows the same gateway-routed architecture as existing providers: the sandbox talks to inference.local, and the OpenShell gateway proxies to the user's endpoint with credential injection and model rewriting. Non-NVIDIA endpoints may reject OpenAI-specific parameters like "store". Set supportsStore: false in the default openclaw.json model compat to prevent 400 rejections from strict endpoints. This is safe for all providers — NVIDIA and Ollama ignore the flag. Interactive mode prompts for base URL, API key, and model name. Non-interactive mode reads NEMOCLAW_CUSTOM_BASE_URL, NEMOCLAW_CUSTOM_API_KEY, and NEMOCLAW_MODEL. Tested with Google Gemini (gemini-2.5-flash) and local Ollama (llama3.2) to verify backward compatibility.
1 parent 1dbf82f commit 30723a3
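To make the routing concrete, here is a minimal sketch of what an OpenAI-style chat call from inside the sandbox looks like under this architecture. It is illustrative only: the `inference.local` base URL and the `unused` placeholder key come from the Dockerfile defaults in this commit, the credential-injection and model-rewriting behavior from the commit description, and `demoChatCall` is a hypothetical name.

```js
// Minimal sketch of an in-sandbox request under the gateway-routed flow.
// The agent only ever talks to inference.local with a placeholder key; the
// OpenShell gateway injects the real credential and rewrites the model name
// before forwarding to the configured custom base URL.
async function demoChatCall() {
  const response = await fetch("https://inference.local/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer unused", // placeholder; the real key never enters the sandbox
    },
    body: JSON.stringify({
      model: "gemini-2.5-flash", // illustrative; whatever model was chosen at onboarding
      messages: [{ role: "user", content: "hello" }],
    }),
  });
  const data = await response.json();
  console.log(data.choices[0].message.content);
}

demoChatCall();
```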

7 files changed

Lines changed: 186 additions & 10 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ config = { \
     'baseUrl': 'https://inference.local/v1', \
     'apiKey': 'unused', \
     'api': 'openai-completions', \
-    'models': [{'id': model, 'name': model, 'reasoning': False, 'input': ['text'], 'cost': {'input': 0, 'output': 0, 'cacheRead': 0, 'cacheWrite': 0}, 'contextWindow': 131072, 'maxTokens': 4096}] \
+    'models': [{'id': model, 'name': model, 'reasoning': False, 'input': ['text'], 'cost': {'input': 0, 'output': 0, 'cacheRead': 0, 'cacheWrite': 0}, 'contextWindow': 131072, 'maxTokens': 4096, 'compat': {'supportsStore': False}}] \
   } \
 }}, \
 'channels': {'defaults': {'configWrites': False}}, \
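Rendered out of the single-line Python dict above, the model entry written into the default openclaw.json looks roughly like the following sketch (the surrounding provider object is omitted and the id is illustrative). The only change in this commit is the `compat` block.

```js
// Sketch of one entry in the default openclaw.json "models" list.
const modelEntry = {
  id: "gemini-2.5-flash", // illustrative; set to the model chosen at onboarding
  name: "gemini-2.5-flash",
  reasoning: false,
  input: ["text"],
  cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  contextWindow: 131072,
  maxTokens: 4096,
  compat: { supportsStore: false }, // new: keep the OpenAI-only "store" param off for strict endpoints
};
```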

README.md

Lines changed: 5 additions & 2 deletions
@@ -179,13 +179,16 @@ When something goes wrong, errors may originate from either NemoClaw or the Open
 
 ## Inference
 
-Inference requests from the agent never leave the sandbox directly. OpenShell intercepts every call and routes it to the NVIDIA Endpoint provider.
+Inference requests from the agent never leave the sandbox directly. OpenShell intercepts every call and routes it through the gateway proxy.
 
 | Provider | Model | Use Case |
 |--------------|--------------------------------------|-------------------------------------------------|
 | NVIDIA Endpoint | `nvidia/nemotron-3-super-120b-a12b` | Production. Requires an NVIDIA API key. |
+| Custom OpenAI-compatible | User-specified | Any provider with an OpenAI-compatible `/v1/chat/completions` endpoint. |
 
-Get an API key from [build.nvidia.com](https://build.nvidia.com). The `nemoclaw onboard` command prompts for this key during setup.
+For the NVIDIA endpoint, get an API key from [build.nvidia.com](https://build.nvidia.com). The `nemoclaw onboard` command prompts for this key during setup.
+
+For custom providers, select "Custom OpenAI-compatible endpoint" during `nemoclaw onboard` and provide the base URL, API key, and model name. Any provider that exposes an OpenAI-compatible `/v1/chat/completions` endpoint will work. For non-interactive mode, set `NEMOCLAW_PROVIDER=custom`, `NEMOCLAW_CUSTOM_BASE_URL`, `NEMOCLAW_CUSTOM_API_KEY`, and `NEMOCLAW_MODEL`.
 
 Local inference options such as Ollama and vLLM are still experimental. On macOS, they also depend on OpenShell host-routing support in addition to the local service itself being reachable on the host.

bin/lib/inference-config.js

Lines changed: 11 additions & 0 deletions
@@ -51,6 +51,17 @@ function getProviderSelectionConfig(provider, model) {
         provider,
         providerLabel: "Local Ollama",
       };
+    case "custom":
+      return {
+        endpointType: "custom",
+        endpointUrl: INFERENCE_ROUTE_URL,
+        ncpPartner: null,
+        model: model || null,
+        profile: DEFAULT_ROUTE_PROFILE,
+        credentialEnv: DEFAULT_ROUTE_CREDENTIAL_ENV,
+        provider,
+        providerLabel: "Custom Provider",
+      };
     default:
       return null;
   }

bin/lib/onboard.js

Lines changed: 104 additions & 7 deletions
@@ -27,7 +27,7 @@ const {
   isUnsupportedMacosRuntime,
   shouldPatchCoredns,
 } = require("./platform");
-const { prompt, ensureApiKey, getCredential } = require("./credentials");
+const { prompt, ensureApiKey, getCredential, saveCredential } = require("./credentials");
 const registry = require("./registry");
 const nim = require("./nim");
 const policies = require("./policies");
@@ -209,10 +209,10 @@ function getNonInteractiveProvider() {
   const providerKey = (process.env.NEMOCLAW_PROVIDER || "").trim().toLowerCase();
   if (!providerKey) return null;
 
-  const validProviders = new Set(["cloud", "ollama", "vllm", "nim"]);
+  const validProviders = new Set(["cloud", "ollama", "vllm", "nim", "custom"]);
   if (!validProviders.has(providerKey)) {
     console.error(` Unsupported NEMOCLAW_PROVIDER: ${providerKey}`);
-    console.error(" Valid values: cloud, ollama, vllm, nim");
+    console.error(" Valid values: cloud, ollama, vllm, nim, custom");
     process.exit(1);
   }
 
@@ -532,6 +532,7 @@ async function setupNim(sandboxName, gpu) {
   let model = null;
   let provider = "nvidia-nim";
   let nimContainer = null;
+  let customCreds = null;
 
   // Detect local inference options
   const hasOllama = !!runCapture("command -v ollama", { ignoreError: true });
@@ -570,6 +571,8 @@ async function setupNim(sandboxName, gpu) {
     options.push({ key: "install-ollama", label: "Install Ollama (macOS)" });
   }
 
+  options.push({ key: "custom", label: "Custom OpenAI-compatible endpoint (bring your own)" });
+
   if (options.length > 1) {
     let selected;
 
@@ -681,6 +684,83 @@ async function setupNim(sandboxName, gpu) {
       console.log(" ✓ Using existing vLLM on localhost:8000");
       provider = "vllm-local";
       model = "vllm-local";
+    } else if (selected.key === "custom") {
+      provider = "custom";
+      let customBaseUrl;
+      let customApiKey;
+      if (isNonInteractive()) {
+        customBaseUrl = (process.env.NEMOCLAW_CUSTOM_BASE_URL || "").trim();
+        customApiKey = (process.env.NEMOCLAW_CUSTOM_API_KEY || "").trim();
+        model = requestedModel;
+        if (!customBaseUrl || !customApiKey || !model) {
+          console.error(" Custom provider requires NEMOCLAW_CUSTOM_BASE_URL, NEMOCLAW_CUSTOM_API_KEY, and NEMOCLAW_MODEL.");
+          process.exit(1);
+        }
+      } else {
+        console.log("");
+        console.log(" ┌─────────────────────────────────────────────────────────────────┐");
+        console.log(" │ Custom OpenAI-compatible provider │");
+        console.log(" │ │");
+        console.log(" │ Provide a base URL and API key for any provider that │");
+        console.log(" │ exposes an OpenAI-compatible /v1/chat/completions endpoint. │");
+        console.log(" │ │");
+        console.log(" │ Examples: │");
+        console.log(" │ Google Gemini https://generativelanguage.googleapis.com/v1beta/openai │");
+        console.log(" │ OpenRouter https://openrouter.ai/api/v1 │");
+        console.log(" │ Together AI https://api.together.xyz/v1 │");
+        console.log(" │ LiteLLM http://localhost:4000/v1 │");
+        console.log(" └─────────────────────────────────────────────────────────────────┘");
+        console.log("");
+
+        customBaseUrl = (await prompt(" Base URL: ")).trim();
+        if (!customBaseUrl) {
+          console.error(" Base URL is required.");
+          process.exit(1);
+        }
+
+        const previousBaseUrl = getCredential("CUSTOM_PROVIDER_BASE_URL");
+        saveCredential("CUSTOM_PROVIDER_BASE_URL", customBaseUrl);
+
+        customApiKey = previousBaseUrl === customBaseUrl
+          ? getCredential("CUSTOM_PROVIDER_API_KEY")
+          : null;
+        if (!customApiKey) {
+          if (previousBaseUrl && previousBaseUrl !== customBaseUrl) {
+            console.log(" Base URL changed — please enter a new API key.");
+          }
+          customApiKey = (await prompt(" API Key: ")).trim();
+          if (!customApiKey) {
+            console.error(" API key is required.");
+            process.exit(1);
+          }
+          saveCredential("CUSTOM_PROVIDER_API_KEY", customApiKey);
+          console.log(" Key saved to ~/.nemoclaw/credentials.json");
+        } else {
+          console.log(" Using saved API key from credentials.");
+        }
+
+        model = await prompt(" Model name (e.g. gemini-2.5-flash): ");
+        if (!model) {
+          console.error(" Model name is required.");
+          process.exit(1);
+        }
+      }
+
+      // Validate base URL
+      try {
+        const parsed = new URL(customBaseUrl);
+        if (parsed.protocol === "http:" && !["localhost", "127.0.0.1", "::1"].includes(parsed.hostname)) {
+          console.error(" Insecure http:// URLs are only allowed for localhost. Use https:// for remote endpoints.");
+          process.exit(1);
+        }
+      } catch {
+        console.error(` Invalid URL: ${customBaseUrl}`);
+        process.exit(1);
+      }
+
+      // Store credentials for setupInference to use
+      customCreds = { baseUrl: customBaseUrl, apiKey: customApiKey };
+      console.log(` ✓ Using custom provider with model: ${model}`);
     }
     // else: cloud — fall through to default below
   }
@@ -703,12 +783,12 @@ async function setupNim(sandboxName, gpu) {
 
   registry.updateSandbox(sandboxName, { model, provider, nimContainer });
 
-  return { model, provider };
+  return { model, provider, customCreds };
 }
 
 // ── Step 5: Inference provider ───────────────────────────────────
 
-async function setupInference(sandboxName, model, provider) {
+async function setupInference(sandboxName, model, provider, customCreds) {
   step(5, 7, "Setting up inference provider");
 
   if (provider === "nvidia-nim") {
@@ -769,6 +849,22 @@ async function setupInference(sandboxName, model, provider) {
       console.error(` ${probe.message}`);
       process.exit(1);
     }
+  } else if (provider === "custom") {
+    const baseUrl = customCreds?.baseUrl || getCredential("CUSTOM_PROVIDER_BASE_URL");
+    const apiKey = customCreds?.apiKey || getCredential("CUSTOM_PROVIDER_API_KEY");
+    run(
+      `openshell provider create --name custom-provider --type openai ` +
+        `--credential ${shellQuote("OPENAI_API_KEY=" + apiKey)} ` +
+        `--config ${shellQuote("OPENAI_BASE_URL=" + baseUrl)} 2>&1 || ` +
+        `openshell provider update custom-provider ` +
+        `--credential ${shellQuote("OPENAI_API_KEY=" + apiKey)} ` +
+        `--config ${shellQuote("OPENAI_BASE_URL=" + baseUrl)} 2>&1 || true`,
+      { ignoreError: true }
+    );
+    run(
+      `openshell inference set --no-verify --provider custom-provider --model ${shellQuote(model)} 2>/dev/null || true`,
+      { ignoreError: true }
+    );
   }
 
   registry.updateSandbox(sandboxName, { model, provider });
@@ -921,6 +1017,7 @@ function printDashboard(sandboxName, model, provider) {
   if (provider === "nvidia-nim") providerLabel = "NVIDIA Endpoint API";
   else if (provider === "vllm-local") providerLabel = "Local vLLM";
   else if (provider === "ollama-local") providerLabel = "Local Ollama";
+  else if (provider === "custom") providerLabel = "Custom Provider";
 
   console.log("");
   console.log(` ${"─".repeat(50)}`);
@@ -949,8 +1046,8 @@ async function onboard(opts = {}) {
   const gpu = await preflight();
   await startGateway(gpu);
   const sandboxName = await createSandbox(gpu);
-  const { model, provider } = await setupNim(sandboxName, gpu);
-  await setupInference(sandboxName, model, provider);
+  const { model, provider, customCreds } = await setupNim(sandboxName, gpu);
+  await setupInference(sandboxName, model, provider, customCreds);
   await setupOpenclaw(sandboxName, model, provider);
   await setupPolicies(sandboxName);
   printDashboard(sandboxName, model, provider);
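The base-URL validation added to `setupNim` above boils down to a small predicate: unparseable input is rejected, and plain `http://` is accepted only for loopback hosts. Here it is as a standalone sketch; the helper name is invented for illustration, and the checks mirror the diff.

```js
// Hypothetical helper mirroring the validation in bin/lib/onboard.js.
function isAcceptableBaseUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a URL at all; onboarding exits with "Invalid URL"
  }
  const loopback = ["localhost", "127.0.0.1", "::1"];
  if (parsed.protocol === "http:" && !loopback.includes(parsed.hostname)) {
    return false; // insecure http:// is only allowed for localhost
  }
  return true;
}

// How the onboarding flow would treat typical inputs:
console.log(isAcceptableBaseUrl("http://localhost:4000/v1"));     // true  (local LiteLLM)
console.log(isAcceptableBaseUrl("https://openrouter.ai/api/v1")); // true
console.log(isAcceptableBaseUrl("http://example.com/v1"));        // false (insecure remote)
console.log(isAcceptableBaseUrl("not a url"));                    // false
```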

docs/inference/switch-inference-providers.md

Lines changed: 38 additions & 0 deletions
@@ -67,6 +67,44 @@ You can switch to any of these models at runtime.
 | `nvidia/llama-3.3-nemotron-super-49b-v1.5` | Nemotron Super 49B v1.5 | 131,072 | 4,096 |
 | `nvidia/nemotron-3-nano-30b-a3b` | Nemotron 3 Nano 30B | 131,072 | 4,096 |
 
+## Custom OpenAI-Compatible Providers
+
+You can use any provider that exposes an OpenAI-compatible `/v1/chat/completions` endpoint.
+
+During `nemoclaw onboard`, select **"Custom OpenAI-compatible endpoint"** and provide:
+
+- **Base URL** — the provider's API base (e.g. `https://generativelanguage.googleapis.com/v1beta/openai`)
+- **API key** — your provider credential
+- **Model name** — the model identifier (e.g. `gemini-2.5-flash`)
+
+Examples of compatible providers:
+
+| Provider | Base URL |
+|---|---|
+| Google AI Studio (Gemini) | `https://generativelanguage.googleapis.com/v1beta/openai` |
+| OpenRouter | `https://openrouter.ai/api/v1` |
+| Together AI | `https://api.together.xyz/v1` |
+| LiteLLM (local) | `http://localhost:4000/v1` |
+
+To switch to a custom provider at runtime:
+
+```console
+$ openshell provider create --name custom-provider --type openai \
+    --credential "OPENAI_API_KEY=<your-key>" \
+    --config "OPENAI_BASE_URL=<base-url>"
+$ openshell inference set --no-verify --provider custom-provider --model <model-name>
+```
+
+For non-interactive onboarding:
+
+```console
+$ NEMOCLAW_PROVIDER=custom \
+    NEMOCLAW_CUSTOM_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai \
+    NEMOCLAW_CUSTOM_API_KEY=AIza... \
+    NEMOCLAW_MODEL=gemini-2.5-flash \
+    nemoclaw onboard --non-interactive
+```
+
 ## Related Topics
 
 - [Inference Profiles](../reference/inference-profiles.md) for full profile configuration details.

nemoclaw-blueprint/blueprint.yaml

Lines changed: 8 additions & 0 deletions
@@ -11,6 +11,7 @@ profiles:
   - ncp
   - nim-local
   - vllm
+  - custom
 
 description: |
   NemoClaw blueprint: orchestrates OpenClaw sandbox creation, migration,
@@ -54,6 +55,13 @@ components:
     credential_env: "OPENAI_API_KEY"
     credential_default: "dummy"
 
+  custom:
+    provider_type: "openai"
+    provider_name: "custom-provider"
+    endpoint: ""
+    model: ""
+    credential_env: "OPENAI_API_KEY"
+
   policy:
     base: "sandboxes/openclaw/policy.yaml"
     additions:

test/inference-config.test.js

Lines changed: 19 additions & 0 deletions
@@ -56,6 +56,25 @@ describe("inference selection config", () => {
     });
   });
 
+  it("maps custom to the sandbox inference route with user-specified model", () => {
+    assert.deepEqual(getProviderSelectionConfig("custom", "gemini-2.5-flash"), {
+      endpointType: "custom",
+      endpointUrl: INFERENCE_ROUTE_URL,
+      ncpPartner: null,
+      model: "gemini-2.5-flash",
+      profile: DEFAULT_ROUTE_PROFILE,
+      credentialEnv: DEFAULT_ROUTE_CREDENTIAL_ENV,
+      provider: "custom",
+      providerLabel: "Custom Provider",
+    });
+  });
+
+  it("returns null model for custom provider when no model specified", () => {
+    const config = getProviderSelectionConfig("custom");
+    assert.equal(config.model, null);
+    assert.equal(config.providerLabel, "Custom Provider");
+  });
+
   it("builds a qualified OpenClaw primary model for ollama-local", () => {
     assert.equal(
       getOpenClawPrimaryModel("ollama-local", "nemotron-3-nano:30b"),
