Prerequisites
Feature Description
I would like to propose a feature enhancement for our WebUI: the ability to dynamically change a model's reasoning effort level on a per-message basis at runtime.
Currently, for models like GPT OSS that support it, we can pass a reasoning_effort parameter (e.g., low, medium, high) via kwargs or directly in the API request.
Proposed Solution:
Add a dedicated button or dropdown selector within the message input pane. This would allow users to select the reasoning effort level (e.g., Low, Medium, High, Off, On) for each new message they send, without needing to alter the overall session configuration.
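As a rough illustration of the idea, here is a minimal sketch of how the selector's value could be applied to a single outgoing request without touching session settings. Everything in it is an assumption for discussion: the `ReasoningChoice` type, the `buildRequestBody` helper, and the request shape are hypothetical and not taken from the WebUI code. In particular, mapping On/Off to an `enable_thinking` entry under `chat_template_kwargs` is a guess at how a toggle-style model might be driven, and the actual field names would depend on the backend and the model's chat template.

```typescript
// Hypothetical sketch: these names are illustrative, not from the WebUI codebase.
type ReasoningChoice = "default" | "off" | "on" | "low" | "medium" | "high";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  messages: ChatMessage[];
  // Sent only when the user picked an explicit effort level.
  reasoning_effort?: "low" | "medium" | "high";
  // Assumed pass-through for template-level switches (toggle-style models).
  chat_template_kwargs?: Record<string, unknown>;
}

// Build the body for one request from the per-message selector value.
// "default" adds nothing, so the session or server configuration applies.
function buildRequestBody(
  messages: ChatMessage[],
  choice: ReasoningChoice
): ChatRequestBody {
  const body: ChatRequestBody = { messages };
  switch (choice) {
    case "low":
    case "medium":
    case "high":
      body.reasoning_effort = choice;
      break;
    case "on":
    case "off":
      // Assumption: the model's template reads an `enable_thinking` flag.
      body.chat_template_kwargs = { enable_thinking: choice === "on" };
      break;
    case "default":
      break;
  }
  return body;
}
```

With this shape, the selector state lives next to the input box and is read once per send, for example `buildRequestBody(history, "high")` for a hard question and `buildRequestBody(history, "off")` for a quick follow-up.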
Extended Use Case & Models:
This functionality would also be valuable for other models that expose similar runtime parameters, such as Qwen3 and NVIDIA Nemotron, which allow reasoning to be toggled on or off.
This feature would provide greater flexibility and control during conversations, enabling users to optimize for speed or depth as needed for each query.
Motivation
It is already implemented in Cherry Studio and some other clients.
Possible Implementation
No response