Skip to content

Feature Request: WebUI add reasoning effort level on a per-message basis #18405

@engrtipusultan

Description

@engrtipusultan

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I would like to propose a feature enhancement for our WebUI: the ability to dynamically change a model's reasoning effort level on a per-message basis at runtime.

Currently, for models like GPT OSS that support it, we can pass a reasoning_effort parameter (e.g., low, medium, high) via kwargs or directly in the API request.

Proposed Solution:
Add a dedicated button or dropdown selector within the message input pane. This would allow users to select the reasoning effort level (e.g., Low, Medium, High, Off, On) for each new message they send, without needing to alter the overall session configuration.

Extended Use Case & Models:
This functionality would be highly valuable for other models that support similar runtime parameters, such as: Qwen3 ,NVIDIA Nemotron toggling reasoning.

This feature would provide greater flexibility and control during conversations, enabling users to optimize for speed or depth as needed for each query.

Motivation

It is already implemented in Cherry Studio and some other clients.

Image

Possible Implementation

No response

Metadata

Metadata

Assignees

Labels

No fields configured for Feature.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions