Conversation
…respects no-action decisions
Hi @saakshisinghal14, fantastic fix! I actually ran into this exact framework limitation last week while building my Information Cascade benchmark (PR #251). My financial agents were hallucinating arbitrary tool calls. This is a massive improvement for the big picture of agent autonomy. I also took a quick look at the downstream impact. Incredibly elegant and necessary fix for production-ready deterministic simulations. Great work!
Thanks @ZhehaoZhao423! Good to know about the downstream impact.
Codecov Report
✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main     #286      +/-   ##
==========================================
+ Coverage   91.55%   91.88%   +0.33%
==========================================
  Files          19       19
  Lines        1645     1627      -18
==========================================
- Hits         1506     1495      -11
+ Misses        139      132       -7
```
Thanks for the contribution and the discussion here. I pushed some follow-up commits and fixes.
Thanks again for the contribution. I'm merging this now.
Thanks @wang-boyu for the detailed follow-up changes and improvements! I really like the alignment across CoT/ReAct/ReWOO and the configurable tool_calls design — makes the execution flow much cleaner and more consistent. Glad I could contribute to this fix. Looking forward to building on this further!
Fixes #253
Summary
When CoT reasoning concludes that no action should be taken (e.g., "I choose NOT to call `adopt_opinion`"), the executor phase still forces a tool call because `tool_choice="required"`. This causes the LLM to pick an arbitrary tool, often an unrelated one like `move_one_step`, contradicting its own reasoning. The same issue exists in the base `Reasoning.execute_tool_call()` and `aexecute_tool_call()` methods used by ReAct and ReWOO.

Changes
- Changed `tool_choice="required"` to `tool_choice="auto"` in all four executor call sites (CoT sync/async, base `Reasoning` sync/async)
- Updated the test assertion for the `tool_choice` value

Files changed
- `mesa_llm/reasoning/cot.py`: `plan()` and `aplan()` executor phases
- `mesa_llm/reasoning/reasoning.py`: `execute_tool_call()` and `aexecute_tool_call()`
- `tests/test_reasoning/test_reasoning.py`: updated assertion for `tool_choice`

Tests
All 21 reasoning tests pass.
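The behavioral consequence of `tool_choice="auto"` can be sketched as follows. This is a hypothetical illustration, not the actual `mesa_llm` code: with `"auto"`, the model may legitimately return a message containing no tool calls, so the executor must treat an empty `tool_calls` list as a valid "no action" outcome rather than assuming a call is always present. The `Message` and `execute_tool_calls` names here are made up for the sketch.

```python
# Hypothetical sketch of an executor that respects no-action decisions.
# With tool_choice="required", a response always carries a tool call, so the
# model is forced to pick one even when its reasoning said not to act.
# With tool_choice="auto", tool_calls may be empty, and the executor below
# simply returns an empty result in that case.
from dataclasses import dataclass, field


@dataclass
class Message:
    """Minimal stand-in for an LLM response message."""
    content: str
    tool_calls: list = field(default_factory=list)  # empty => model chose no action


def execute_tool_calls(message: Message) -> list[str]:
    """Run each requested tool; return [] when the model declined to act."""
    if not message.tool_calls:
        return []  # respect the no-action decision instead of forcing a call
    return [f"executed {name}" for name in message.tool_calls]


# The reasoning said not to act, and nothing is executed:
print(execute_tool_calls(Message(content="I choose NOT to call adopt_opinion")))  # -> []
# A genuine tool request still goes through:
print(execute_tool_calls(Message(content="", tool_calls=["adopt_opinion"])))  # -> ['executed adopt_opinion']
```

Under `"required"`, the first case is where the arbitrary `move_one_step`-style call would have been injected.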
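The updated test assertion could take a shape like the following. This is a hedged sketch, not the contents of `tests/test_reasoning/test_reasoning.py`; the mocked client and call signature are assumptions. The idea is to capture the keyword arguments of the (mocked) completion call and assert on the `tool_choice` value.

```python
# Hypothetical sketch of asserting the tool_choice value passed to the LLM
# client, using unittest.mock to record call arguments.
from unittest.mock import MagicMock

llm = MagicMock()

# The executor under test would issue a call like this (arguments assumed):
llm.completion(model="gpt-4o", messages=[], tools=[], tool_choice="auto")

# call_args unpacks into (positional args, keyword args) of the last call.
_, kwargs = llm.completion.call_args
assert kwargs["tool_choice"] == "auto"  # the old assertion checked "required"
```

Asserting on the recorded kwargs keeps the test independent of any real LLM backend.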