feat: expand Vertex/Gemini support — thinking, structured output, tools, caching (0.16.0) #56

Merged: nyo16 merged 1 commit into master from vertex-gemini-improvements on May 10, 2026

nyo16 (Owner) commented on May 10, 2026

Summary

A bundle of Gemini-on-Vertex improvements. Most of the new surface lands as helpers in Nous.Messages.Gemini, wired into both Nous.Providers.VertexAI and Nous.Providers.Gemini, so every new option works against either entry point.

Added

  • Thinking config (request-side). :thinking_config → generationConfig.thinkingConfig. Accepts the Elixir shape (%{thinking_budget: 1024, include_thoughts: true}) or the native Vertex shape.
  • thoughtSignature round-trip on tool calls. Required for multi-turn thinking + tool loops on Gemini 2.5/3.x. Preserved in tool_call["metadata"]["thought_signature"] and echoed back when serializing
    assistant turns. Stream normalizer also propagates it on :tool_call_delta.
  • Structured output. :json_response / :json_schema → responseMimeType / responseSchema. Cross-provider :response_format shapes (%{type: :json_schema, schema: ...} and %{type: :json_object}) map through.
  • Safety settings. :safety_settings → top-level safetySettings.
  • Tool config / tool choice. :tool_config (raw map) or :tool_choice (:auto / :any / :required / :none / {:any, ["name"]}) → top-level toolConfig.
  • Function calling actually works on Vertex/Gemini. New Nous.ToolSchema.to_gemini/1 emits proper functionDeclarations shape (strips OpenAI's strict and unsupported additionalProperties).
    Previously the high-level Nous.LLM path silently dropped tools for these providers.
  • Native Vertex tools. :native_tools accepts :google_search, :url_context, :code_execution, plus {tool, config} tuples and raw maps. Combined with function declarations in the same tools
    array.
  • Context caching. :cached_content → top-level cachedContent (pass-through).
  • Streaming + tools. Nous.LLM.stream_text/3 now honors :tools. Tool-call deltas (with thoughtSignature) aggregate per turn, tools execute between turns, conversation continues until the model
    stops calling tools or hits @max_tool_iterations. Text deltas are still yielded as produced.
  • More generationConfig fields: topK, seed, candidateCount, presencePenalty, frequencyPenalty, responseModalities.
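
The option-to-request mappings above can be illustrated with a small standalone sketch. It is not the code in Nous.Messages.Gemini: the module name GeminiOptsSketch and its functions are hypothetical, and mapping :required to the "ANY" mode is an assumption. The functionCallingConfig shape with mode and allowedFunctionNames follows the public Gemini API.

```elixir
defmodule GeminiOptsSketch do
  @moduledoc "Hypothetical illustration; not the library's implementation."

  # Turn snake_case keys (atoms or strings) into the camelCase strings that
  # Vertex expects. Keys that are already camelCase pass through unchanged.
  def camelize_keys(map) when is_map(map) do
    Map.new(map, fn {key, value} -> {camelize(key), value} end)
  end

  # Translate a friendly :tool_choice value into a top-level toolConfig map.
  # Assumption: :required is treated as an alias of :any.
  def tool_config(:auto), do: mode("AUTO")
  def tool_config(:none), do: mode("NONE")
  def tool_config(choice) when choice in [:any, :required], do: mode("ANY")

  def tool_config({:any, names}) when is_list(names) do
    %{"functionCallingConfig" => %{"mode" => "ANY", "allowedFunctionNames" => names}}
  end

  defp mode(name), do: %{"functionCallingConfig" => %{"mode" => name}}

  defp camelize(key) when is_atom(key), do: key |> Atom.to_string() |> camelize()

  defp camelize(key) when is_binary(key) do
    [head | tail] = String.split(key, "_")
    Enum.join([head | Enum.map(tail, &String.capitalize/1)])
  end
end

# Build the request fragments for a thinking call restricted to one tool.
thinking = GeminiOptsSketch.camelize_keys(%{thinking_budget: 1024, include_thoughts: true})
tool_cfg = GeminiOptsSketch.tool_config({:any, ["get_weather"]})
_body = %{"generationConfig" => %{"thinkingConfig" => thinking}, "toolConfig" => tool_cfg}
```

Accepting both key styles in one helper is what lets callers pass either the Elixir shape or the native Vertex shape without a separate option.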
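
The streaming tool loop described above (yield text as it arrives, collect tool-call deltas per turn, execute tools, repeat until no tool calls or the iteration cap) might look roughly like the following. This is a self-contained sketch, not the stream_text/3 implementation: ToolLoopSketch, the event tuples, the message maps, and the cap of 5 are all assumptions made for illustration.

```elixir
defmodule ToolLoopSketch do
  @moduledoc "Hypothetical illustration of a per-turn tool loop."

  # Placeholder cap; the real @max_tool_iterations value is not stated in the PR.
  @max_tool_iterations 5

  # `turn_fun` stands in for one streamed model turn: it receives the messages
  # so far and returns a list of events. `tools` maps names to 1-arity functions.
  def run(turn_fun, tools, messages, emit \\ &IO.write/1) do
    loop(turn_fun, tools, messages, emit, 0)
  end

  defp loop(_turn_fun, _tools, messages, _emit, iteration)
       when iteration >= @max_tool_iterations do
    {:max_iterations, messages}
  end

  defp loop(turn_fun, tools, messages, emit, iteration) do
    # Emit text deltas immediately; collect tool calls for the end of the turn.
    calls =
      messages
      |> turn_fun.()
      |> Enum.reduce([], fn
        {:text_delta, text}, acc ->
          emit.(text)
          acc

        {:tool_call_delta, call}, acc ->
          [call | acc]
      end)
      |> Enum.reverse()

    case calls do
      [] ->
        {:done, messages}

      _ ->
        # Run each requested tool and record its result as a tool message.
        results =
          Enum.map(calls, fn %{"name" => name, "arguments" => args} ->
            %{role: :tool, name: name, content: Map.fetch!(tools, name).(args)}
          end)

        # Keep the calls intact (including any "metadata") on the assistant
        # turn so a thought signature can be echoed back on the next request.
        assistant = %{role: :assistant, tool_calls: calls}
        loop(turn_fun, tools, messages ++ [assistant | results], emit, iteration + 1)
    end
  end
end
```

Storing the tool calls unmodified on the assistant turn is the design point that matters here: anything carried in the call's metadata, such as a thought signature, survives into the next turn without the loop needing to know about it.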

Changed

  • Single timeout source of truth. Removed @streaming_timeout (300s on Vertex, 120s on Gemini). Streaming and non-streaming share one provider default; the actual timeout is model.receive_timeout,
    flowing through build_provider_opts/1.
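
As a rough picture of the single-timeout rule, the sketch below resolves one receive_timeout for both streaming and non-streaming calls. It is only an illustration: TimeoutSketch and provider_opts/1 are hypothetical stand-ins for build_provider_opts/1, the model is represented as a plain map, and the 60_000 ms default is a placeholder because the PR does not state the provider default.

```elixir
defmodule TimeoutSketch do
  @moduledoc "Hypothetical illustration; not the library's build_provider_opts/1."

  # Placeholder default in milliseconds; the real value is not given in the PR.
  @default_receive_timeout 60_000

  # One lookup serves every call type: the model's own receive_timeout wins,
  # otherwise the shared provider default applies.
  def provider_opts(model) when is_map(model) do
    [receive_timeout: Map.get(model, :receive_timeout) || @default_receive_timeout]
  end
end

# A long thinking call raises the timeout on the model itself.
_opts = TimeoutSketch.provider_opts(%{receive_timeout: 600_000})
```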

Version

  • mix.exs: 0.15.8 → 0.16.0
  • CHANGELOG entry under [0.16.0] - 2026-05-10.

Test plan

  • mix test — 1709 tests, 0 failures (101 excluded via the :llm tag)
  • mix compile --warnings-as-errors clean
  • mix format --check-formatted clean
  • Manual smoke against real Vertex w/ service account: thinking + tool round-trip on gemini-2.5-pro
  • Manual smoke: JSON schema response on gemini-2.0-flash
  • Manual smoke: :google_search returns grounded citations
  • Manual smoke: streaming with tools fires both, completes loop
  • Manual smoke: receive_timeout: 600_000 on a long thinking call

Notes

  • :tools for Vertex/Gemini was effectively broken in Nous.LLM before this PR (the high-level path used the OpenAI tool schema and the providers never wrote a tools field). After this PR, tool calling
    on Gemini-on-Vertex works through the same Nous.LLM.generate_text/3 and stream_text/3 API as the other providers.
  • Helpers added on Nous.Messages.Gemini are public so users can consume them directly if they're calling Nous.Providers.{VertexAI,Gemini}.chat/2 at the low level.
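
To make the schema difference in the first note concrete, here is a minimal sketch of converting OpenAI-style tool definitions into a Gemini functionDeclarations entry. It is not Nous.ToolSchema.to_gemini/1: the module ToolSchemaSketch is hypothetical, and string-keyed input maps are an assumption about the tool shape.

```elixir
defmodule ToolSchemaSketch do
  @moduledoc "Hypothetical illustration; not the library's to_gemini/1."

  # Convert a list of OpenAI-style tools into one Gemini tools-array entry.
  # Taking only name and description drops "strict" at the function level.
  def to_function_declarations(openai_tools) when is_list(openai_tools) do
    declarations =
      Enum.map(openai_tools, fn %{"function" => fun} ->
        declaration = Map.take(fun, ["name", "description"])

        case Map.fetch(fun, "parameters") do
          {:ok, params} -> Map.put(declaration, "parameters", strip_unsupported(params))
          :error -> declaration
        end
      end)

    %{"functionDeclarations" => declarations}
  end

  # Remove "additionalProperties" from every schema node. Property names under
  # "properties" are preserved as-is; only their sub-schemas are cleaned.
  defp strip_unsupported(%{} = schema) do
    schema
    |> Map.delete("additionalProperties")
    |> Map.new(fn
      {"properties", props} when is_map(props) ->
        {"properties", Map.new(props, fn {name, sub} -> {name, strip_unsupported(sub)} end)}

      {key, value} ->
        {key, strip_unsupported(value)}
    end)
  end

  defp strip_unsupported(list) when is_list(list), do: Enum.map(list, &strip_unsupported/1)
  defp strip_unsupported(other), do: other
end
```

The resulting map is one element of the request's tools array, which is where native tools such as google_search would sit alongside it.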

…ls, caching (0.16.0)

Bundle of Gemini-on-Vertex improvements. Most surface lands as helpers in
Nous.Messages.Gemini wired into both Nous.Providers.VertexAI and
Nous.Providers.Gemini, so anything new works against either entry point.

- Thinking config (request-side) via :thinking_config (snake_case or
  Vertex camelCase), and thoughtSignature round-trip on tool calls so
  multi-turn thinking + tool loops keep working on Gemini 2.5/3.x.
- Structured output: :json_response and :json_schema map to
  responseMimeType / responseSchema; :response_format also flows through.
- :safety_settings → top-level safetySettings.
- :tool_config / :tool_choice (with :auto / :any / :required / :none /
  {:any, names} friendly forms) → top-level toolConfig.
- Function calling on Vertex/Gemini now works through the high-level
  Nous.LLM path: ToolSchema.to_gemini/1 emits proper functionDeclarations
  (strips OpenAI's "strict" + unsupported "additionalProperties").
- :native_tools accepts :google_search, :url_context, :code_execution
  (plus {tool, config} tuples / raw maps).
- :cached_content passes through as top-level cachedContent.
- Nous.LLM.stream_text/3 now honors :tools — tool_call_deltas (incl.
  thoughtSignature) aggregate per turn, tools execute, conversation
  continues until the model stops calling tools or hits max iterations.
- More generationConfig fields: topK, seed, candidateCount,
  presencePenalty, frequencyPenalty, responseModalities.
- Single timeout source of truth: removed the separate streaming-only
  defaults in both providers; receive_timeout flows uniformly through
  build_provider_opts/1.
nyo16 merged commit 6cda604 into master on May 10, 2026
6 checks passed
nyo16 deleted the vertex-gemini-improvements branch on May 10, 2026 at 11:28