Skip to content

0xelitesystem/agent-failure-modes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

agent-failure-modes

The recurring ways LLM agents fail in production, with detection signals and mitigations for each.

This is for people shipping agents that use tools, run loops, or plan multi-step actions. Not a list of safety risks, not a catalog of jailbreaks. The patterns here are everyday operational failures that cost real users time and money.

Contents

Pattern File
Tool-call loop patterns/01-tool-loop.md
Planning collapse patterns/02-planning-collapse.md
Instruction drift patterns/03-instruction-drift.md
Sandbagging patterns/04-sandbagging.md
Premature completion patterns/05-premature-completion.md
Tool selection error patterns/06-tool-selection-error.md
Argument hallucination patterns/07-argument-hallucination.md
Error masking patterns/08-error-masking.md
Context window exhaustion patterns/09-context-exhaustion.md
User confusion as agent confusion patterns/10-user-confusion-amplification.md

How to use this

When an agent task fails:

  1. Read the agent's full trace (every tool call, every response)
  2. Find the step where things went wrong
  3. Match the pattern from the list above
  4. Apply the mitigation from that pattern's file

Most production agent failures match one of these 10 patterns. The remaining 10 to 20% are domain-specific.

What this is not

  • Not a safety guide. These are operational failures, not alignment issues.
  • Not a benchmark. There is no scoring system here; agents fail in ways specific to the tools they have access to.
  • Not exhaustive. New failure modes will appear as agents get more autonomous; this list will grow.

A note on scope

Most patterns assume an agent loop that calls tools and processes results. Single-shot LLM use without tool calling has its own failure modes (covered in rag-evaluation-rubrics).

Multi-agent systems (one LLM coordinating multiple LLM workers) have an additional set of failures around message passing and task decomposition; not covered here.

Contributing

See CONTRIBUTING.md. New patterns must include detection signals and at least one mitigation strategy.

License

MIT.

Related

About

agent-failure-modes

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors