AI/LLM test strategy for an e-commerce product recommendation engine — prompt regression, hallucination detection, toxicity safety gate, latency SLOs, and contract testing
Updated May 25, 2026 - Python
Portfolio-grade AI quality evaluation lab with golden datasets, prompt regression, groundedness checks, hallucination tests and CI thresholds.
Catch prompt regressions from model drift — on a schedule, not just on PRs.
PromptOps — Evaluate, improve, test, and run your prompts in Claude Code. Score prompts 1–5, auto-improve with guardrails, regression test against golden datasets, benchmark costs across models, and execute — all in one session.
Offline prompt regression CI checks for OpenAI-compatible gateways, model routes, JSON output, and tool-call readiness.
Mirror of a prompt regression test template for AI agents, aimed at Codex, Claude Code, and Cursor teams. Routes to a $203 Agent Ops team license.