feat(llm): add native Databricks Unity AI Gateway provider #3286
prasadkona wants to merge 23 commits
authoritative reference for credential handling.

See the companion skill ``databricks-ai-gateway-fm-apis`` (in ``_local/skills``)
for the routing table, worked examples, and a runnable ``probe.py`` that
I don't think this exists? What is the referenced skill?
"model": ("llm_model",),
"api_key": ("llm_api_key",),
"base_url": ("llm_base_url",),
# OpenHands web app stores the Databricks workspace URL in llm_base_url.
I suggest not relying on what the web app is doing. First, we are in the middle of reworking it; second, from an architectural point of view, the SDK needs to support all client applications rather than depend on what one client app does.
enyst left a comment
Thank you for the contribution, @prasadkona. I find it very interesting that your description points out the four major LLM APIs. I recently rewrote LLM API clients from scratch elsewhere (not in this repo), and I found that working with each API separately may indeed be a more flexible and less error-prone approach than litellm’s attempt to convert everything to an OpenAI-compatible format. But that’s just me.
In OpenHands, we’ve had litellm since forever, and I’m not sure we are ready to add another generic provider. Just as a heads-up, this is a complicated proposal.
I realize this is a draft, so there is no hurry. Just for the record, I would love to know more about the differences and why Databricks should be added. Thank you for the work on it!
Adds DatabricksLLM, a native provider for the Databricks AI Gateway that bypasses LiteLLM and routes directly to the correct per-family endpoint:
- Anthropic Claude → /anthropic/v1/messages
- Google Gemini → /gemini/v1/generateContent
- OpenAI GPT-5+ → /openai/v1/responses (gpt-\d routing rule)
- All others → /mlflow/v1/chat/completions
Auth: PAT, M2M (service-principal), CLI profile, and U2M (browser SSO via databricks-sdk). All auth strategies resolve credentials lazily so saving settings succeeds before the optional databricks-sdk package is installed.
Base class changes are minimal and PR-friendly:
- `LLM`: slim 15-line dispatch validator (generic subclass discovery, no hardcoded names); no new fields on the base class
- `AgentBase.llm` + `LLMSummarizingCondenser.llm`: `SerializeAsAny` annotation so DatabricksLLM fields survive agent save/load round-trips
- `model_features.py`: early-return guard for `databricks/` prefix
- `__init__.py`: additive `create_llm` factory
Includes 275 unit tests covering auth, client, discovery, routing, native API translation (multi-turn tool calls, Responses API format), resilience, and settings bridge.
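The per-family routing rule described in this commit message could look roughly like the sketch below. The four gateway paths and the `gpt-\d` rule are taken from the message; the function name, the constants, and the substring checks for Claude and Gemini are assumptions made for illustration, not the PR's actual code.

```python
import re

# Gateway paths as listed in the commit message. All names here are hypothetical.
ANTHROPIC_PATH = "/anthropic/v1/messages"
GEMINI_PATH = "/gemini/v1/generateContent"
OPENAI_RESPONSES_PATH = "/openai/v1/responses"
DEFAULT_CHAT_PATH = "/mlflow/v1/chat/completions"

# "gpt-" followed by a digit, so numbered GPT models match but "gpt-oss" does not.
_GPT_NUMBERED = re.compile(r"gpt-\d")


def gateway_path_for(model: str) -> str:
    """Pick a gateway path from the model name alone (no extra API call)."""
    name = model.lower()
    if "claude" in name:
        return ANTHROPIC_PATH
    if "gemini" in name:
        return GEMINI_PATH
    if _GPT_NUMBERED.search(name):
        return OPENAI_RESPONSES_PATH
    return DEFAULT_CHAT_PATH
```

Under these assumptions, a name such as `databricks-gpt-oss-120b` falls through to the default chat path because `gpt-\d` requires a digit after the hyphen.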
…ator The safety_settings field was removed from LLM. The @field_validator for it needs check_fields=False to avoid a Pydantic startup error when the field no longer exists in the model.
…ate kwarg When DatabricksLLM is constructed with stream=True the base LLM.completion() passes stream=True through **kwargs in addition to enable_streaming. Pop it before forwarding to DatabricksFMAPIClient.chat_completion() to prevent the 'multiple values for keyword argument stream' TypeError.
… forwarding
extra_headers and extra_body are litellm-specific conventions that the base LLM class injects into call kwargs. DatabricksLLM._transport_call previously forwarded these via **kwargs into DatabricksFMAPIClient.chat_completion(), which serialised them as JSON body fields, causing HTTP 400 errors from the AI Gateway (e.g. gpt-5-mini: "Unknown parameter: 'extra_headers'").
Fix: explicitly pop extra_headers, extra_body, and stream from kwargs before forwarding to chat_completion(). stream was already popped; this commit extends the strip list to cover the two new offenders.
Tests: two new unit tests verify the strip at both the _transport_call layer (test_llm.py) and the client layer (test_client.py). Full Databricks suite: 278 passed.
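The strip step described above can be illustrated with a small helper. This is a sketch under assumptions: the helper and constant names are hypothetical, and the real `_transport_call` may pop the keys inline instead of copying the dictionary.

```python
from typing import Any

# Keys named in the commit message that must not reach the JSON request body.
LITELLM_ONLY_KWARGS = ("extra_headers", "extra_body", "stream")


def strip_litellm_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of kwargs without the litellm-specific keys."""
    cleaned = dict(kwargs)  # copy so the caller's dict is left untouched
    for key in LITELLM_ONLY_KWARGS:
        cleaned.pop(key, None)  # the default avoids KeyError when a key is absent
    return cleaned
```

Copying first keeps the caller's kwargs intact, which matters if the same dictionary is reused for a retry.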
DatabricksLLM.databricks_client_secret is a SecretStr field that was not
registered in LLM_SECRET_FIELDS, so the base _serialize_secrets field
serializer never fired for it. On AgentStore.save() the field was written as
"**********" to agent_settings.json; on reload that masked string was sent
to the Databricks OIDC /v1/token endpoint, causing a 401 on every M2M session
restart.
Fix: add a dedicated @field_serializer("databricks_client_secret") on
DatabricksLLM that delegates to serialize_secret() and converts any returned
SecretStr to the REDACTED_SECRET_VALUE string (avoiding Pydantic warnings).
When AgentStore.save() passes context={"expose_secrets": True} the plaintext
value is written correctly and round-trips through model_validate_json().
Adds test_m2m_client_secret_serialized_as_plaintext_with_expose_secrets to
cover the redact / plaintext / round-trip paths.
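The redact / plaintext / round-trip behaviour can be shown without Pydantic. The sketch below is plain Python with hypothetical function names; `REDACTED` stands in for `REDACTED_SECRET_VALUE`, and the load-side check mirrors the idea of discarding a masked placeholder rather than sending it to the token endpoint.

```python
from typing import Optional

REDACTED = "**********"  # stand-in for REDACTED_SECRET_VALUE


def dump_secret(value: Optional[str], expose_secrets: bool = False) -> Optional[str]:
    """Serialize a secret: plaintext only when the caller asks for it."""
    if value is None:
        return None
    return value if expose_secrets else REDACTED


def load_secret(raw: Optional[str]) -> Optional[str]:
    """Treat a masked placeholder (or blank string) as 'no secret stored'."""
    if raw is None or raw == REDACTED or not raw.strip():
        return None
    return raw
```

With `expose_secrets=True` the value survives a save and reload; without it, the reloaded value is `None` instead of a string of asterisks that would be rejected with a 401.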
str.replace('Bearer ', '') replaces ALL occurrences — safe in practice since
tokens never contain that string, but split(' ', 1)[1] is more idiomatic and
defensive. Applies to both PROFILE and UNIFIED auth strategy get_token() closures.
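A minimal sketch of the `split(' ', 1)` approach follows. The function name and the validation of the scheme are assumptions added for illustration; the commit only describes the change from `replace` to `split`.

```python
def bearer_token(header_value: str) -> str:
    """Extract the token from a 'Bearer <token>' header value."""
    parts = header_value.split(" ", 1)  # split on the first space only
    if len(parts) != 2 or parts[0] != "Bearer" or not parts[1]:
        raise ValueError("expected a value of the form 'Bearer <token>'")
    return parts[1]
```

Unlike `replace`, this leaves any later occurrence of `Bearer ` inside the token untouched.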
- Eagerly import DatabricksLLM in sdk/llm/__init__.py so it registers with LLM.__subclasses__() at module-load time. This allows the agent server's _dispatch_to_provider_subclass validator to reconstruct a DatabricksLLM from serialized JSON (provider="databricks") without requiring an explicit import in the agent server process.
- Add public close() method to DatabricksLLM to avoid reaching into the private _db_client attribute from callers.
- Standardize USER_AGENT to "OpenHandsOSS/<version>" in utils.py.
- Add databricks_host alias in UserInfoAliases (settings_bridge.py) so llm_base_url from user settings correctly populates databricks_host when constructing DatabricksLLM kwargs.
- Add context/skills compatibility shims (__init__.py, skill.py, utils.py) re-exporting Skill-related symbols that moved within the SDK, preventing ImportError in the agent server subprocess.
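Why the eager import matters can be seen in a stripped-down dispatch sketch. The classes below are hypothetical stand-ins (plain Python, not Pydantic models): `__subclasses__()` only reports classes whose defining module has already been imported, so a subclass that was never imported cannot be selected.

```python
from typing import Any


class BaseLLM:
    provider = "base"  # hypothetical discriminator value

    def __init__(self, model: str) -> None:
        self.model = model

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "BaseLLM":
        wanted = data.get("provider", cls.provider)
        # Only direct subclasses that are already defined/imported appear here.
        for sub in cls.__subclasses__():
            if sub.provider == wanted:
                return sub(model=data["model"])
        return cls(model=data["model"])


class DatabricksLikeLLM(BaseLLM):
    provider = "databricks"
```

Because the lookup is generic, the base class needs no hardcoded provider names; the cost is that every provider module must be loaded before deserialization runs.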
test_user_agent_format previously asserted startswith("openhands_oss/")
which broke when the constant was renamed to "OpenHandsOSS/<version>".
Update assertion to match the new canonical product name.
Also update the discovery test docstring that referenced the old prefix.
…elds to DatabricksLLM
These fields store the custom OAuth app credentials used in the U2M PKCE browser flow. Previously they only existed in the CLI's SettingsFormData and were lost after the first PKCE sign-in because kwargs_from_settings only extracts _BRIDGE_FIELDS. Adding them to DatabricksLLM allows them to:
- Survive round-trips through model_dump_json / model_validate_json (agent settings)
- Be preserved when rebuilding the LLM after PKCE token exchange
- Be read back by the settings UI so the auth method shows as U2M on re-open
Add databricks_u2m_client_secret as a SecretStr field on DatabricksLLM with a matching field_serializer (mirrors databricks_client_secret for M2M). Add it to _BRIDGE_FIELDS so kwargs_from_settings passes it through to create_llm. Without this, the U2M client secret was never written to agent_settings.json; every CLI restart cleared the field, causing the PKCE token exchange to fail with 401 Unauthorized for confidential apps.
…arding
- auth.py: _resolve_u2m accepts optional client_secret and includes it in refresh-token requests for confidential OAuth apps. resolve_credentials forwards databricks_u2m_client_secret to _resolve_u2m.
- llm.py: add field_validator for databricks_u2m_client_secret that calls validate_secret() to coerce str→SecretStr and discard redacted placeholders.
- settings_bridge.py: add databricks_u2m_client_secret to _SECRET_FIELDS so it is coerced to SecretStr and never logged in plaintext.
- test_auth.py: add _make_mock_llm databricks_u2m_client_secret param; new tests for confidential-client refresh and resolve_credentials forwarding.
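The confidential-client refresh described in the auth.py item follows the standard OAuth 2.0 refresh-token grant. The sketch below only builds the form body; the function name is hypothetical, and sending the secret in the body rather than in an HTTP Basic header is an assumption about how the request is shaped.

```python
from typing import Optional
from urllib.parse import urlencode


def build_refresh_body(
    refresh_token: str,
    client_id: str,
    client_secret: Optional[str] = None,
) -> str:
    """Build an application/x-www-form-urlencoded refresh-token request body."""
    form = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
    }
    if client_secret:
        # Only confidential OAuth apps have a secret; public clients omit it.
        form["client_secret"] = client_secret
    return urlencode(form)
```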
…2026 FMAPI
discovery.py (CURATED_DATABRICKS_MODELS):
- Claude: claude-sonnet-4-6 (new recommended), keep 4-5/haiku-4-5; add opus-4-7, opus-4-5 (current flagships); keep opus-4-1
- GPT-5: gpt-5-mini stays recommended; add gpt-5-5-pro, gpt-5-5, gpt-5-4, gpt-5-4-mini; keep gpt-5 and gpt-oss-120b
- Gemini: gemini-3-5-flash (new recommended); add gemini-3-flash, gemini-3-pro; keep gemini-2-5-flash/pro
llm.py (DATABRICKS_CONTEXT_WINDOWS / DATABRICKS_MAX_OUTPUT):
- Remove stale pre-Claude-4 entries (claude-3-5-sonnet-2, claude-3-7-sonnet, dbrx-instruct, mixtral-8x7b, llama-3-1-70b)
- Rename meta-llama-4-maverick → llama-4-maverick (matches FMAPI docs)
- Add full GPT-5 codex/numbered variant line (5-1 through 5-5-pro)
- Add Gemini 3 series (gemini-3-flash, 3-5-flash, 3-pro, 3-1-pro, 3-1-flash-lite)
- Add Qwen/Gemma/Llama-3-1-8b entries
conversation_error.py:
- Add intelligent user-facing hints for Databricks-specific errors:
- [404] AI Gateway endpoint does not exist → endpoint name / gateway
URL mismatch guidance
- [401] UNAUTHORIZED → token expired / wrong workspace guidance
- [429] RATE_LIMIT_EXCEEDED → quota / retry guidance
- [403] Invalid access to Org → cross-geography model serving note
with recommendation to use Refresh Models and pick a supported model
- Hints are surfaced in the ConversationErrorEvent.visualize property
discovery.py:
- Remove databricks-gemini-3-flash and databricks-gemini-3-pro from
CURATED_DATABRICKS_MODELS; these require cross-geography routing not
available in all workspaces and cause confusing 403 errors
- Add context-window and max-output metadata for all verified models
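The error hints listed for conversation_error.py could be driven by a small lookup table. This is a sketch only: the hint wording, the helper name, and the assumption that the bracketed status code appears verbatim in the error message are all illustrative, not taken from the PR's code.

```python
from typing import Optional

# (status, marker text, hint). Status codes and markers follow the commit
# message; the hint wording is a hypothetical paraphrase.
_HINTS = (
    ("404", "endpoint does not exist", "Check the endpoint name and the AI Gateway URL."),
    ("401", "UNAUTHORIZED", "The token may have expired or belong to a different workspace."),
    ("429", "RATE_LIMIT_EXCEEDED", "The quota was reached; retry after a short wait."),
    ("403", "Invalid access to Org", "The model may need cross-geography serving; refresh the model list and pick a supported model."),
)


def hint_for(message: str) -> Optional[str]:
    """Return a user-facing hint for a known gateway error, else None."""
    lowered = message.lower()
    for status, marker, hint in _HINTS:
        if f"[{status}]" in message and marker.lower() in lowered:
            return hint
    return None
```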
[Automatic Post]: It has been a while since there was any activity on this PR. @prasadkona, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up. This comment was created by an AI agent (OpenHands) on behalf of the user.
…rors
- Add missing required_positional_arg to ModelFeatures.__init__ to match updated signature (fixes TypeError on CLI startup)
- Handle PermissionError gracefully when iterating workspace directory in find_third_party_files (fixes crash on macOS TCC-restricted paths)
- Update Databricks provider utils for correct base_url resolution
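The PermissionError handling in the second item can be sketched as follows. The helper name is hypothetical and the real find_third_party_files may log or skip differently; the point is that the directory listing is consumed inside the `try`, so the error is caught wherever the iterator raises it.

```python
from pathlib import Path


def list_children(directory: Path) -> list[Path]:
    """List a directory's entries, treating an unreadable directory as empty."""
    try:
        # sorted() consumes the iterator here, inside the try block.
        return sorted(directory.iterdir())
    except PermissionError:
        return []
```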
fdfe2e1 to fc0f735
Restore skills modules and the safety_settings deprecation validator to upstream main — these were fork-drift artifacts unrelated to the Databricks provider. The connector does not depend on them.
Consolidate the Authorization Code + PKCE browser-login primitives into a single SDK module so the OpenHands web app and CLI no longer maintain separate copies. Provides generate_pkce, build_authorize_url, and both sync and async code-for-token exchange helpers, exported from the databricks provider package. Bumps SDK to 1.27.0.
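For readers unfamiliar with PKCE, the two primitives named above can be sketched with the standard library alone. The function names carry a `_sketch` suffix because the real `generate_pkce` and `build_authorize_url` signatures are not shown in this PR page; the authorize endpoint, scope, and parameter set are caller-supplied assumptions.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def generate_pkce_sketch() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    verifier = secrets.token_urlsafe(64)  # URL-safe, within the 43..128 char limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without padding, as PKCE requires.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url_sketch(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    challenge: str,
    state: str,
    scope: str,
) -> str:
    """Compose the browser URL for the authorization-code step."""
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": scope,
            "state": state,
            "code_challenge": challenge,
            "code_challenge_method": "S256",
        }
    )
    return f"{authorize_endpoint}?{query}"
```

The verifier stays on the client and is sent only in the later code-for-token exchange, where the server recomputes the challenge and compares.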
Remove obsolete model entries from family-detection parametrizations and update curated-model assertions to current Foundation Model API names.
…PKCE helpers Bring the provider README in sync with the code: add pkce.py and settings_bridge.py to the module-layout table, note the __init__ now exports the PKCE helpers, and add an Authentication paragraph describing the shared U2M browser-login helpers (generate_pkce / build_authorize_url / exchange_code_for_tokens) consumed by both the web backend and the CLI.
Add an Alignment with ucode section to the provider README: the connector's PROFILE/UNIFIED/U2M strategies let an OpenHands agent reach AI Gateway the same key-free, governed, workspace-credential way Databricks ucode does.
Summary
Adds a native, optimized provider (`DatabricksLLM`) for the Databricks Unity AI
Gateway, giving OpenHands agents governed access to the foundation models served
through a customer's Databricks workspace. The connector is Databricks PWAF
(Partner Well-Architected Framework) compliant.
This is the foundational PR. The companion CLI PR (OpenHands/OpenHands-CLI#740)
and web-app PR (OpenHands/OpenHands#14449) depend on this being merged first.
Motivation
Databricks customers access foundation models through the Unity AI Gateway,
which provides a single governed entry point — unified authentication, access
control, usage tracking, and policy enforcement — across multiple model families.
This connector lets OpenHands talk to that gateway natively, with full control over
the connection lifecycle, retry/backoff against the gateway's error contract, and
the distinct native API surfaces that different model families expose (OpenAI Chat,
Anthropic Messages, Gemini generateContent, OpenAI Responses). The result is an
optimized, governance-aware path to Databricks-served foundation models.
What's new
New package: `openhands/sdk/llm/providers/databricks/`
- `llm.py`: `DatabricksLLM`, a Pydantic subclass of `LLM`
- `client.py`
- `native.py`
- `models.py`: `ProviderFamily`, gateway path routing, `StoredU2MTokens`
- `auth.py`: `resolve_credentials()`, token providers
- `discovery.py`: `list_chat_endpoints` / `list_foundation_models` with TTL cache
- `utils.py`
Unity AI Gateway model families
- `OPENAI`: `llm/v1/chat` endpoints (Llama, GPT-OSS, …)
- `OPENAI_RESPONSES`: `databricks-gpt-5*`
- `ANTHROPIC`: `*claude*`
- `GEMINI`: `*gemini*`
Routing is name-pattern by default (no extra API call). An opt-in
`databricks_metadata_probe=True` mode performs an authoritative serving-endpoint
lookup (5-minute TTL) for external-model endpoints.
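A TTL cache of the kind mentioned for discovery and the metadata probe can be sketched in a few lines. The class below is a hypothetical illustration, not the PR's implementation; the 300-second default mirrors the 5-minute TTL stated above, and the injectable clock exists only to make the behaviour easy to test.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache one loaded value and reload it after ttl_seconds have elapsed."""

    def __init__(
        self,
        loader: Callable[[], T],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First call or expired entry: hit the backing lookup again.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

Using a monotonic clock avoids spurious expiry or staleness when the system wall clock is adjusted.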
Auth strategies (all governed through the workspace)
- `~/.databrickscfg` (requires `databricks-sdk` optional dep)
- `databricks-sdk` auth chain (workload identity, Azure AD, …)
Changes to existing files (minimal wiring only)
- `sdk/__init__.py`: routes `databricks/` model IDs to the native provider and everything else to the base `LLM`.
- `sdk/llm/llm.py`: adds a model validator so that serialized agents containing `"provider": "databricks"` rehydrate as `DatabricksLLM`. Purely additive: no existing logic is changed.
Tests
- `tests/sdk/llm/providers/databricks/`: 279 unit tests covering the Gateway model families and all 5 auth strategies
Test plan
- `uv run pytest tests/sdk/llm/providers/databricks/ -q`: 279 passed
- `databricks/...` model ID resolves to the native Databricks provider `LLM`
Alignment with Databricks `ucode`
This integration follows the same credential model as Databricks `ucode`, the
Unity AI Gateway Coding CLI that launches coding agents through the Databricks
AI Gateway using workspace credentials, no API keys required. The connector's
`PROFILE` and `UNIFIED` strategies read the workspace login a developer has
already established (`databricks auth login` / `~/.databrickscfg`), and `U2M`
provides interactive browser OAuth, so an OpenHands agent can reach AI Gateway
the same key-free, governed way `ucode` does, reusing the existing workspace
session rather than a separate token. The result is one consistent, governed
path to AI Gateway (and the Unity Catalog–governed resources behind it) across
`ucode` and OpenHands.