Detects structural prompt weaknesses before they affect production outputs, improving the quality, consistency, and reliability of subsequent generations across the pipeline.
consulting quality-assurance evaluation-framework mbb claude mckinsey bcg rubric novelty-detection ai-agent anthropic llm-evaluation hallucination-detection llm-as-judge claude-code agent-evaluation strategy-ai claude-skill sycophancy-detection anthropic-skill
Updated May 22, 2026 - Python