feat: expand Vertex/Gemini support — thinking, structured output, tools, caching (0.16.0) by nyo16 · Pull Request #56 · nyo16/nous

nyo16 · 2026-05-10T11:19:09Z

Summary

Bundle of Gemini-on-Vertex improvements. Most of the new surface lands as helpers in Nous.Messages.Gemini wired into both Nous.Providers.VertexAI and Nous.Providers.Gemini, so anything new works
against either entry point.

Added

Thinking config (request-side). :thinking_config → generationConfig.thinkingConfig. Accepts Elixir shape (%{thinking_budget: 1024, include_thoughts: true}) or native Vertex shape.
thoughtSignature round-trip on tool calls. Required for multi-turn thinking + tool loops on Gemini 2.5/3.x. Preserved in tool_call["metadata"]["thought_signature"] and echoed back when serializing
assistant turns. Stream normalizer also propagates it on :tool_call_delta.
Structured output. :json_response / :json_schema → responseMimeType / responseSchema. Cross-provider :response_format shapes (%{type: :json_schema, schema: ...} and %{type: :json_object}) map through.
Safety settings. :safety_settings → top-level safetySettings.
Tool config / tool choice. :tool_config (raw map) or :tool_choice (:auto / :any / :required / :none / {:any, ["name"]}) → top-level toolConfig.
Function calling actually works on Vertex/Gemini. New Nous.ToolSchema.to_gemini/1 emits proper functionDeclarations shape (strips OpenAI's strict and unsupported additionalProperties).
Previously the high-level Nous.LLM path silently dropped tools for these providers.
Native Vertex tools. :native_tools accepts :google_search, :url_context, :code_execution, plus {tool, config} tuples and raw maps. Combined with function declarations in the same tools
array.
Context caching. :cached_content → top-level cachedContent (pass-through).
Streaming + tools. Nous.LLM.stream_text/3 now honors :tools. Tool-call deltas (with thoughtSignature) aggregate per turn, tools execute between turns, conversation continues until the model
stops calling tools or hits @max_tool_iterations. Text deltas are still yielded as produced.
More generationConfig fields: topK, seed, candidateCount, presencePenalty, frequencyPenalty, responseModalities.
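
As a rough illustration of the option mappings listed above, the sketch below translates the Elixir-shaped :thinking_config and the friendly :tool_choice forms into Gemini request maps. The module GeminiOptsSketch and its function names are hypothetical and are not the helpers in Nous.Messages.Gemini; treating :required as an alias for :any is an assumption.

```elixir
defmodule GeminiOptsSketch do
  @moduledoc """
  Hypothetical illustration only; not the code in Nous.Messages.Gemini.
  """

  # Elixir-shaped thinking config -> camelCase keys for
  # generationConfig.thinkingConfig. Any other key is stringified unchanged,
  # so a map already in native Vertex shape passes through.
  def thinking_config(config) when is_map(config) do
    Map.new(config, fn
      {:thinking_budget, value} -> {"thinkingBudget", value}
      {:include_thoughts, value} -> {"includeThoughts", value}
      {key, value} -> {to_string(key), value}
    end)
  end

  # Friendly :tool_choice forms -> a top-level toolConfig map.
  # Assumption: :required is treated the same as :any (mode "ANY").
  def tool_config(:auto), do: with_mode("AUTO")
  def tool_config(:none), do: with_mode("NONE")
  def tool_config(choice) when choice in [:any, :required], do: with_mode("ANY")

  # {:any, names} restricts the model to the named functions.
  def tool_config({:any, names}) when is_list(names) do
    %{"functionCallingConfig" => %{"mode" => "ANY", "allowedFunctionNames" => names}}
  end

  defp with_mode(mode), do: %{"functionCallingConfig" => %{"mode" => mode}}
end
```

Under these assumptions, GeminiOptsSketch.tool_config({:any, ["get_weather"]}) yields a functionCallingConfig with mode "ANY" and allowedFunctionNames ["get_weather"], where "get_weather" is a made-up tool name.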

Changed

Single timeout source of truth. Removed @streaming_timeout (300s on Vertex, 120s on Gemini). Streaming and non-streaming share one provider default; the actual timeout is model.receive_timeout,
flowing through build_provider_opts/1.
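
A minimal sketch of the single-source timeout rule, assuming the model is a map or struct with an optional :receive_timeout field. TimeoutSketch and the 60_000 ms default are hypothetical; the real default and the wiring live in the providers and build_provider_opts/1.

```elixir
defmodule TimeoutSketch do
  # Hypothetical shared default in milliseconds; not the library's actual value.
  @default_receive_timeout 60_000

  # One rule for streaming and non-streaming calls alike:
  # the model's receive_timeout wins, otherwise the shared default applies.
  def receive_timeout(model) when is_map(model) do
    Map.get(model, :receive_timeout) || @default_receive_timeout
  end
end
```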

Version

mix.exs: 0.15.8 → 0.16.0
CHANGELOG entry under [0.16.0] - 2026-05-10.

Test plan

mix test — 1709 tests, 0 failures (101 excluded :llm tag)
mix compile --warnings-as-errors clean
mix format --check-formatted clean
Manual smoke against real Vertex w/ service account: thinking + tool round-trip on gemini-2.5-pro
Manual smoke: JSON schema response on gemini-2.0-flash
Manual smoke: :google_search returns grounded citations
Manual smoke: streaming with tools fires both, completes loop
Manual smoke: receive_timeout: 600_000 on a long thinking call

Notes

:tools for Vertex/Gemini was effectively broken in Nous.LLM before this PR (the high-level path used the OpenAI tool schema and the providers never wrote a tools field). After this PR, tool calling
on Gemini-on-Vertex works through the same Nous.LLM.generate_text/3 and stream_text/3 API as the other providers.
Helpers added on Nous.Messages.Gemini are public so users can consume them directly if they're calling Nous.Providers.{VertexAI,Gemini}.chat/2 at the low level.
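
The tool loop described for stream_text/3 can be pictured roughly as below. This is a self-contained sketch, not the library's implementation: ToolLoopSketch, the call_model stand-in (one model turn returning {text, tool_calls}), the message maps, and the cap of 5 iterations are all assumptions made for illustration. The assistant turn keeps each tool call map intact, which is where metadata such as thought_signature would ride along.

```elixir
defmodule ToolLoopSketch do
  # Hypothetical iteration cap; stands in for @max_tool_iterations.
  @max_tool_iterations 5

  # `call_model` stands in for one aggregated model turn: it receives the
  # message list and returns {text, tool_calls}.
  # `tools` maps a tool name to a one-argument function.
  def run(messages, tools, call_model, iteration \\ 0)

  def run(_messages, _tools, _call_model, iteration)
      when iteration >= @max_tool_iterations do
    {:error, :max_tool_iterations}
  end

  def run(messages, tools, call_model, iteration) do
    case call_model.(messages) do
      # No tool calls: the model is done and the text is the final answer.
      {text, []} ->
        {:ok, text}

      # Tool calls: record the assistant turn unchanged, run each tool,
      # append the results, and ask the model again.
      {text, tool_calls} ->
        assistant = %{role: :assistant, content: text, tool_calls: tool_calls}

        results =
          Enum.map(tool_calls, fn %{"name" => name, "arguments" => args} = call ->
            %{role: :tool, name: name, content: tools[name].(args), tool_call: call}
          end)

        run(messages ++ [assistant | results], tools, call_model, iteration + 1)
    end
  end
end
```

Recursion with an explicit counter keeps the stop conditions (no more tool calls, or the cap reached) visible in the function heads.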

…ls, caching (0.16.0)

Bundle of Gemini-on-Vertex improvements. Most surface lands as helpers in Nous.Messages.Gemini wired into both Nous.Providers.VertexAI and Nous.Providers.Gemini, so anything new works against either entry point.

- Thinking config (request-side) via :thinking_config (snake_case or Vertex camelCase), and thoughtSignature round-trip on tool calls so multi-turn thinking + tool loops keep working on Gemini 2.5/3.x.
- Structured output: :json_response and :json_schema map to responseMimeType / responseSchema; :response_format also flows through.
- :safety_settings → top-level safetySettings.
- :tool_config / :tool_choice (with :auto / :any / :required / :none / {:any, names} friendly forms) → top-level toolConfig.
- Function calling on Vertex/Gemini now works through the high-level Nous.LLM path: ToolSchema.to_gemini/1 emits proper functionDeclarations (strips OpenAI's "strict" + unsupported "additionalProperties").
- :native_tools accepts :google_search, :url_context, :code_execution (plus {tool, config} tuples / raw maps).
- :cached_content passes through as top-level cachedContent.
- Nous.LLM.stream_text/3 now honors :tools: tool_call_deltas (incl. thoughtSignature) aggregate per turn, tools execute, conversation continues until the model stops calling tools or hits max iterations.
- More generationConfig fields: topK, seed, candidateCount, presencePenalty, frequencyPenalty, responseModalities.
- Single timeout source of truth: removed the separate streaming-only defaults in both providers; receive_timeout flows uniformly through build_provider_opts/1.

nyo16 merged commit 6cda604 into master May 10, 2026
6 checks passed

nyo16 deleted the vertex-gemini-improvements branch May 10, 2026 11:28

nyo16 merged 1 commit into master from vertex-gemini-improvements
