feat: expand Vertex/Gemini support — thinking, structured output, tools, caching (0.16.0)#56
Merged
Bundle of Gemini-on-Vertex improvements. Most of the new surface lands
as helpers in Nous.Messages.Gemini, wired into both
Nous.Providers.VertexAI and Nous.Providers.Gemini, so anything new works
against either entry point.
- Thinking config (request-side) via :thinking_config (snake_case or
Vertex camelCase), and thoughtSignature round-trip on tool calls so
multi-turn thinking + tool loops keep working on Gemini 2.5/3.x.
- Structured output: :json_response and :json_schema map to
responseMimeType / responseSchema; :response_format also flows through.
- :safety_settings → top-level safetySettings.
- :tool_config / :tool_choice (with :auto / :any / :required / :none /
{:any, names} friendly forms) → top-level toolConfig.
- Function calling on Vertex/Gemini now works through the high-level
Nous.LLM path: ToolSchema.to_gemini/1 emits proper functionDeclarations
(strips OpenAI's "strict" + unsupported "additionalProperties").
- :native_tools accepts :google_search, :url_context, :code_execution
(plus {tool, config} tuples / raw maps).
- :cached_content passes through as top-level cachedContent.
- Nous.LLM.stream_text/3 now honors :tools — tool_call_deltas (incl.
thoughtSignature) aggregate per turn, tools execute, conversation
continues until the model stops calling tools or hits max iterations.
- More generationConfig fields: topK, seed, candidateCount,
presencePenalty, frequencyPenalty, responseModalities.
- Single timeout source of truth: removed the separate streaming-only
defaults in both providers; receive_timeout flows uniformly through
build_provider_opts/1.
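
The option translations listed above can be sketched in a few lines. This is a minimal illustration, not the code from this PR: GeminiOptsSketch and its function names are hypothetical, the nesting under functionCallingConfig follows the public Gemini REST request shape, and treating :required as a synonym for :any is an assumption.

```elixir
defmodule GeminiOptsSketch do
  @moduledoc "Illustrative only; a hypothetical module, not part of Nous."

  @doc "Maps a snake_case thinking config to the camelCase request shape."
  def thinking_config(config) when is_map(config) do
    Map.new(config, fn
      {:thinking_budget, value} -> {"thinkingBudget", value}
      {:include_thoughts, value} -> {"includeThoughts", value}
      # Native Vertex keys (already camelCase strings) pass through untouched.
      {key, value} when is_binary(key) -> {key, value}
    end)
  end

  @doc "Maps a friendly tool choice to a top-level toolConfig map."
  def tool_config(:auto), do: mode("AUTO")
  def tool_config(:none), do: mode("NONE")
  # Assumption: :required behaves like :any (the model must call some tool).
  def tool_config(choice) when choice in [:any, :required], do: mode("ANY")

  def tool_config({:any, names}) when is_list(names) do
    # Restrict the forced call to the named functions.
    put_in(mode("ANY"), ["functionCallingConfig", "allowedFunctionNames"], names)
  end

  defp mode(name), do: %{"functionCallingConfig" => %{"mode" => name}}
end

# Assemble the two fragments into a partial request body.
request = %{
  "generationConfig" => %{
    "thinkingConfig" =>
      GeminiOptsSketch.thinking_config(%{thinking_budget: 1024, include_thoughts: true})
  },
  "toolConfig" => GeminiOptsSketch.tool_config({:any, ["get_weather"]})
}

IO.inspect(request)
```

Keeping each translation as a pure function over plain maps makes it possible for two provider modules to share it, which matches the stated goal of supporting both entry points.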
Summary
Bundle of Gemini-on-Vertex improvements. Most of the new surface lands as helpers in Nous.Messages.Gemini wired into both Nous.Providers.VertexAI and Nous.Providers.Gemini, so anything new works against either entry point.

Added
- :thinking_config → generationConfig.thinkingConfig. Accepts the Elixir shape (%{thinking_budget: 1024, include_thoughts: true}) or the native Vertex shape.
- thoughtSignature round-trip on tool calls. Required for multi-turn thinking + tool loops on Gemini 2.5/3.x. Preserved in tool_call["metadata"]["thought_signature"] and echoed back when serializing assistant turns. The stream normalizer also propagates it on :tool_call_delta.
- :json_response / :json_schema → responseMimeType / responseSchema. Cross-provider :response_format shapes (%{type: :json_schema, schema: ...} and %{type: :json_object}) map through.
- :safety_settings → top-level safetySettings.
- :tool_config (raw map) or :tool_choice (:auto / :any / :required / :none / {:any, ["name"]}) → top-level toolConfig.
- Nous.ToolSchema.to_gemini/1 emits the proper functionDeclarations shape (strips OpenAI's strict and unsupported additionalProperties). Previously the high-level Nous.LLM path silently dropped tools for these providers.
- :native_tools accepts :google_search, :url_context, :code_execution, plus {tool, config} tuples and raw maps. Combined with function declarations in the same tools array.
- :cached_content → top-level cachedContent (pass-through).
- Nous.LLM.stream_text/3 now honors :tools. Tool-call deltas (with thoughtSignature) aggregate per turn, tools execute between turns, and the conversation continues until the model stops calling tools or hits @max_tool_iterations. Text deltas are still yielded as produced.
- More generationConfig fields: topK, seed, candidateCount, presencePenalty, frequencyPenalty, responseModalities.

Changed
- Removed @streaming_timeout (300s on Vertex, 120s on Gemini). Streaming and non-streaming share one provider default; the actual timeout is model.receive_timeout, flowing through build_provider_opts/1.

Version
- mix.exs: 0.15.8 → 0.16.0
- [0.16.0] - 2026-05-10.

Test plan
- mix test: 1709 tests, 0 failures (101 excluded, :llm tag)
- mix compile --warnings-as-errors clean
- mix format --check-formatted clean
- gemini-2.5-pro
- gemini-2.0-flash
- :google_search returns grounded citations
- receive_timeout: 600_000 on a long thinking call

Notes
- :tools for Vertex/Gemini was effectively broken in Nous.LLM before this PR (the high-level path used the OpenAI tool schema and the providers never wrote a tools field). After this PR, tool calling on Gemini-on-Vertex works through the same Nous.LLM.generate_text/3 and stream_text/3 API as the other providers.
- The helpers in Nous.Messages.Gemini are public so users can consume them directly if they're calling Nous.Providers.{VertexAI,Gemini}.chat/2 at the low level.
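
To make the tool-schema fix concrete, here is a rough sketch of turning OpenAI-style tool definitions into a Gemini-style tools array, with native tools appended to the same array. It is an assumption-laden illustration, not Nous.ToolSchema.to_gemini/1 itself: the GeminiToolsSketch module, its tools/2 function, and the camelCase keys chosen for native tools are all hypothetical.

```elixir
defmodule GeminiToolsSketch do
  @moduledoc "Illustrative only; a hypothetical module, not Nous.ToolSchema."

  @unsupported ["strict", "additionalProperties"]

  @doc "Builds a Gemini-style tools array from OpenAI-style tools plus native tools."
  def tools(openai_tools, native_tools \\ []) do
    functions =
      case Enum.map(openai_tools, &declaration/1) do
        [] -> []
        declarations -> [%{"functionDeclarations" => declarations}]
      end

    functions ++ Enum.map(native_tools, &native/1)
  end

  # Keep only the fields a function declaration needs; "strict" is dropped here.
  defp declaration(%{"function" => function}) do
    case Map.take(function, ["name", "description", "parameters"]) do
      %{"parameters" => params} = decl -> %{decl | "parameters" => strip(params)}
      decl -> decl
    end
  end

  # Recursively remove unsupported schema keywords. Entries under "properties"
  # are parameter names, not keywords, so those keys are always preserved.
  defp strip(schema) when is_map(schema) do
    schema
    |> Map.drop(@unsupported)
    |> Map.new(fn
      {"properties", props} when is_map(props) ->
        {"properties", Map.new(props, fn {name, sub} -> {name, strip(sub)} end)}

      {key, value} ->
        {key, strip(value)}
    end)
  end

  defp strip(list) when is_list(list), do: Enum.map(list, &strip/1)
  defp strip(other), do: other

  # Assumption: native tools use camelCase keys with an empty default config.
  defp native(:google_search), do: %{"googleSearch" => %{}}
  defp native(:url_context), do: %{"urlContext" => %{}}
  defp native(:code_execution), do: %{"codeExecution" => %{}}

  defp native({tool, config}) when is_atom(tool) and is_map(config) do
    Map.new(native(tool), fn {key, _default} -> {key, config} end)
  end

  defp native(raw) when is_map(raw), do: raw
end

# An OpenAI-style tool definition with the two fields that need stripping.
weather = %{
  "type" => "function",
  "function" => %{
    "name" => "get_weather",
    "description" => "Current weather for a city",
    "strict" => true,
    "parameters" => %{
      "type" => "object",
      "additionalProperties" => false,
      "properties" => %{"city" => %{"type" => "string"}},
      "required" => ["city"]
    }
  }
}

IO.inspect(GeminiToolsSketch.tools([weather], [:google_search]))
```

The stripping step treats the keys under "properties" as parameter names rather than schema keywords, so a tool parameter that happens to be called "strict" would survive the conversion.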