chore: adding more common failure modes to AGENTS.md #548

Draft
dionhaefner wants to merge 1 commit into main from dion/more-agents

Conversation

@dionhaefner
Contributor

Relevant issue or PR

Description of changes

Testing done

@codecov

codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.92%. Comparing base (fcff6c9) to head (63a2366).

❗ There is a different number of reports uploaded between BASE (fcff6c9) and HEAD (63a2366).

HEAD has 27 fewer uploads than BASE.

| Flag | BASE (fcff6c9) | HEAD (63a2366) |
|------|----------------|----------------|
|      | 34             | 7              |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #548      +/-   ##
==========================================
- Coverage   77.31%   67.92%   -9.39%     
==========================================
  Files          32       32              
  Lines        4381     4381              
  Branches      723      723              
==========================================
- Hits         3387     2976     -411     
- Misses        701     1166     +465     
+ Partials      293      239      -54     

@jpbrodrick89
Contributor

Hmm, I'd be a bit more careful here. Yes, token management is less of an issue now with 1M-token windows, but a suggestion that could be interpreted as "read the whole codebase" could easily lead to context pollution. I'd probably keep the bit about not making assumptions, but maybe say "verify any speculation against the codebase" rather than the current wording.

Similarly, end-to-end tests take a long time, and often you just need to run a subset when iterating fast. Maybe try to keep the idea that a piece of work isn't complete until the end-to-end suite has been run, and that confidence should be subdued until then. Not sure what's best; maybe a suggestion to ask the user whether to run the whole test suite on every iteration or just a subset until told otherwise.

@dionhaefner
Contributor Author

Appreciate the input, but it says neither "whole codebase" nor "whole test suite" :)

This is a draft, so I'll do more testing, but so far it's been helpful.
