fix: change executor tool_choice from required to auto so reasoning respects no-action decisions #286

Merged
wang-boyu merged 6 commits into mesa:main from saakshisinghal14:fix/cot-executor-respects-no-action
Apr 12, 2026

@saakshisinghal14
Contributor

Fixes #253

Summary

When CoT reasoning concludes that no action should be taken (e.g., "I choose NOT to call adopt_opinion"), the executor phase still forces a tool call because tool_choice="required". This causes the LLM to pick an arbitrary tool — often an unrelated one like move_one_step — contradicting its own reasoning.

The same issue exists in the base Reasoning.execute_tool_call() and aexecute_tool_call() methods used by ReAct and ReWOO.

Changes

  • Changed tool_choice="required" to tool_choice="auto" in all four executor call sites (CoT sync/async, base Reasoning sync/async)
  • Updated the executor system prompt to explicitly instruct the LLM that it may skip tool calls when the plan concludes no action is needed
  • Updated existing test assertion to match the new tool_choice value
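The effect of the switch can be illustrated with a toy stand-in for the LLM call. This is a minimal sketch, not mesa-llm code: `StubResponse` and `stub_llm` are hypothetical names, and the string check on the plan is only a crude proxy for the model's decision. It assumes the usual provider semantics in which `"required"` forces at least one tool call and `"auto"` allows a plain-text answer.

```python
from dataclasses import dataclass, field


@dataclass
class StubResponse:
    """Hypothetical stand-in for an LLM response (not the mesa-llm type)."""
    content: str
    tool_calls: list = field(default_factory=list)


def stub_llm(plan: str, tools: list[str], tool_choice: str) -> StubResponse:
    """Toy model of how the two tool_choice modes treat a no-action plan."""
    # Crude proxy: the plan asks for an action unless it says "NOT".
    wants_action = "NOT" not in plan
    if tool_choice == "required" or wants_action:
        # "required" forces a call even though the plan declined to act,
        # so some tool gets picked regardless of the reasoning.
        return StubResponse(content="", tool_calls=[tools[0]])
    # "auto" lets the model reply in plain text with no tool call.
    return StubResponse(content="No action taken.")


plan = "I choose NOT to call adopt_opinion"
tools = ["move_one_step", "adopt_opinion"]

forced = stub_llm(plan, tools, tool_choice="required")
free = stub_llm(plan, tools, tool_choice="auto")

print(forced.tool_calls)  # an unrelated tool call is forced
print(free.tool_calls)    # empty: the no-action decision is respected
```

With `"required"`, the stub returns a call to `move_one_step` despite the plan; with `"auto"`, it returns no tool calls.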

Files changed

  • mesa_llm/reasoning/cot.py: plan() and aplan() executor phases
  • mesa_llm/reasoning/reasoning.py: execute_tool_call() and aexecute_tool_call()
  • tests/test_reasoning/test_reasoning.py: updated assertion for tool_choice

Tests

All 21 reasoning tests pass.

@coderabbitai
Contributor

coderabbitai bot commented Apr 1, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@ZhehaoZhao423

Hi @saakshisinghal14, fantastic fix! I actually ran into this exact framework limitation last week while building my Information Cascade benchmark (PR #251). My financial agents were hallucinating move_one_step actions purely because the CoT executor was coercing them to pick a tool, even when their reasoning explicitly concluded to "HOLD" their positions.

Following the review guidelines, this is a massive improvement for the big picture of agent autonomy. Shifting tool_choice from required to auto correctly returns the decision boundary to the LLM's cognitive phase.

I also took a quick look at the downstream impact. Since tool_choice="auto" means the LLM might return responses without tool calls entirely, I checked ToolManager.call_tools / acall_tools. Fortunately, the existing getattr(llm_response, "tool_calls", []) with the if not tool_calls: check perfectly catches None or missing payloads and safely returns []. This means your PR integrates flawlessly without triggering any downstream TypeErrors.
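For reference, the guard pattern described above can be sketched in isolation. This is an illustrative reduction, not the actual `ToolManager.call_tools` body: `extract_tool_calls` is a hypothetical helper, and `SimpleNamespace` objects stand in for LLM responses.

```python
from types import SimpleNamespace


def extract_tool_calls(llm_response) -> list:
    """Return the tool calls on a response, or [] when there are none."""
    # getattr covers a response object with no tool_calls attribute at all.
    tool_calls = getattr(llm_response, "tool_calls", [])
    # "not tool_calls" covers both None and an empty list.
    if not tool_calls:
        return []
    return list(tool_calls)


no_attr = SimpleNamespace(content="HOLD")                    # attribute missing
none_calls = SimpleNamespace(content="HOLD", tool_calls=None)  # attribute is None
one_call = SimpleNamespace(content="", tool_calls=["move_one_step"])

print(extract_tool_calls(no_attr))
print(extract_tool_calls(none_calls))
print(extract_tool_calls(one_call))
```

The first two cases both yield an empty list, so a caller that iterates over the result never hits a `TypeError` from iterating over `None`.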

Incredibly elegant and necessary fix for production-ready deterministic simulations. Great work!

@saakshisinghal14
Contributor Author

Thanks @ZhehaoZhao423! Good to know the downstream call_tools path handles the None case cleanly. I checked the same code path but appreciate the independent confirmation. Glad this helps your Information Cascade work too.

@wang-boyu added the reasoning and bug labels Apr 11, 2026
@codecov

codecov bot commented Apr 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.88%. Comparing base (b6ece59) to head (038d2d1).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
+ Coverage   91.55%   91.88%   +0.33%     
==========================================
  Files          19       19              
  Lines        1645     1627      -18     
==========================================
- Hits         1506     1495      -11     
+ Misses        139      132       -7     


@wang-boyu
Member

Thanks for the contribution and the discussion here. I pushed some follow-up commits / fixes:

  • fixed the ReWOO no-op case when llm_plan.tool_calls is None
  • added a tool_calls parameter so execution-side tool_choice is configurable (defaults to "auto") while planning still uses tool_choice="none"
  • refactored CoT to use the shared execute_tool_call() path to align with ReAct and ReWOO
  • aligned memory handling across CoT, ReAct, and ReWOO to have consistent plan_execution logging
  • updated some docstrings and test cases
  • merged main branch to resolve conflicts
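The split between planning and execution described in the list above might look roughly like the following. This is only a sketch under stated assumptions: `RecordingLLM`, `SketchReasoning`, and the `execution_tool_choice` argument are hypothetical names chosen for illustration and are not the merged mesa-llm API or its parameter name.

```python
class RecordingLLM:
    """Hypothetical LLM stub that records the tool_choice of each call."""

    def __init__(self):
        self.seen = []

    def generate(self, prompt: str, tool_choice: str) -> str:
        self.seen.append(tool_choice)
        return prompt


class SketchReasoning:
    """Hypothetical reasoning class with a configurable execution mode."""

    def __init__(self, llm, execution_tool_choice: str = "auto"):
        self.llm = llm
        # Assumed default of "auto" so a no-action plan is not overridden.
        self.execution_tool_choice = execution_tool_choice

    def plan(self, observation: str) -> str:
        # Planning phase: tools are disabled so the model only reasons.
        plan_text = self.llm.generate(observation, tool_choice="none")
        # Execution phase: tool use follows the configured mode.
        return self.llm.generate(
            plan_text, tool_choice=self.execution_tool_choice
        )


default_llm = RecordingLLM()
SketchReasoning(default_llm).plan("neighbor changed opinion")

strict_llm = RecordingLLM()
SketchReasoning(strict_llm, execution_tool_choice="required").plan("obs")

print(default_llm.seen)
print(strict_llm.seen)
```

Keeping the mode configurable lets a model that must act on every step opt back into `"required"` without affecting the planning call.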

Thanks again for the contribution. I'm merging this now.

@wang-boyu wang-boyu merged commit 5ca61f7 into mesa:main Apr 12, 2026
15 checks passed
@saakshisinghal14
Contributor Author

Thanks @wang-boyu for the detailed follow-up changes and improvements!

I really like the alignment across CoT/ReAct/ReWOO and the configurable tool_calls design — makes the execution flow much cleaner and more consistent.

Glad I could contribute to this fix. Looking forward to building on this further!


Labels

bug, reasoning


Development

Successfully merging this pull request may close these issues.

[Bug] CoT executor ignores reasoner conclusion — calls tool despite "no action" decision

3 participants