
Harden review prompts for consistency and noise reduction #579

Draft
mariusvniekerk wants to merge 6 commits into main from review-skill-improver

Conversation

@mariusvniekerk
Collaborator

Summary

  • Define impact-based severity levels — Replace bare high/medium/low labels with concrete definitions tied to real-world impact (data loss, exploitability, blast radius). Gives all agents a shared calibration standard so severity is consistent across reviews.
  • Require concrete harm articulation — Every finding must now explain what specifically goes wrong if left unfixed. Eliminates vague "violates best practices" findings by forcing agents to justify each issue with concrete reasoning.
  • Add evidence thresholds — Explicit "do not report" instructions suppress the most common false positive categories: hypothetical issues in unseen code, style opinions, unfounded "missing tests" claims, and flagging codebase conventions as issues.
  • Add intent-implementation alignment check — Reverse the old "do not review the commit message" instruction. The commit message now serves as the primary lens for evaluating the diff, catching gaps between what the developer intended and what they actually wrote.
  • Add self-review quality gate — Before outputting, agents must verify every finding has a specific file/line reference, severity matches described impact, and no findings contradict each other. Drops findings that fail.
  • Add evidence thresholds to insights analysis — Tiered confidence thresholds (1-2 = data point, 3-5 = candidate, 6+ = strong recommendation) prevent guideline suggestions from single occurrences and give high confidence to well-evidenced patterns.

🤖 Generated with Claude Code

mariusvniekerk and others added 6 commits March 24, 2026 13:09
Bare "high/medium/low" labels give agents no shared calibration standard,
leading to inconsistent severity across reviews. Defining each level in
terms of real-world impact (data loss, exploitability, blast radius) aligns
all agents on the same scale and naturally prevents low-value findings from
being over-rated.

Inspired by the impact × breadth scoring pattern from research-oriented
analysis skills.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
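The shared calibration standard this commit describes could be sketched as a simple lookup table (illustrative only — the actual change is prompt text in `internal/prompt/prompt.go`, and these definitions paraphrase the commit, not the real wording):

```go
package main

import "fmt"

// severityGuidance maps each severity label to an impact-based definition,
// mirroring the calibration standard described above. Wording is a
// paraphrase of the commit message, not the prompt's actual text.
var severityGuidance = map[string]string{
	"high":   "data loss, an exploitable vulnerability, or a wide blast radius",
	"medium": "incorrect behavior on a realistic path with a limited blast radius",
	"low":    "a minor defect with narrow, recoverable impact",
}

func main() {
	for _, level := range []string{"high", "medium", "low"} {
		fmt.Printf("%s: %s\n", level, severityGuidance[level])
	}
}
```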
Replace "brief explanation of the problem" with "what specifically goes
wrong if this is not fixed." This is the articulation test pattern from
research-oriented analysis skills — every finding must justify itself with
concrete impact reasoning, not just pattern-matching against a checklist.

Findings like "this violates best practices" become impossible to write
when the prompt demands specific harm. This is the single most effective
noise reduction technique across the mop-mapping skill set.

Applied to all review types: standard, dirty, range, security, and design.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
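One way to picture the articulation test is as a predicate over a finding's justification (a hypothetical sketch — the real change is prompt wording, and `vaguePhrases` is an invented illustration of the kind of text the prompt makes impossible to write):

```go
package main

import (
	"fmt"
	"strings"
)

// vaguePhrases are justifications that fail the articulation test:
// they name a rule, not a harm. The list is illustrative.
var vaguePhrases = []string{
	"violates best practices",
	"is not idiomatic",
	"could be cleaner",
}

// passesArticulationTest reports whether a justification describes
// concrete harm rather than pattern-matching against a checklist.
func passesArticulationTest(justification string) bool {
	j := strings.ToLower(strings.TrimSpace(justification))
	if j == "" {
		return false
	}
	for _, p := range vaguePhrases {
		if strings.Contains(j, p) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(passesArticulationTest("this violates best practices"))
	fmt.Println(passesArticulationTest("the retry re-sends the charge, double-billing the customer"))
}
```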
The /rethink skill uses explicit evidence thresholds — "1 observation is a
data point, 3+ is a pattern worth investigating." The /verify skill grounds
every check in specific data. Applied here as negative prompt instructions
that suppress the most common false positive categories: hypothetical issues
in unseen code, style preferences, unfounded "missing tests" claims, and
flagging patterns that match existing codebase conventions.

Security reviews get a lighter version — they should still err toward
reporting, but not flag theoretical vulnerabilities in untouched code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
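The "hypothetical issues in unseen code" suppression can be sketched as a post-filter over findings (a minimal illustration; the actual mechanism is a negative instruction in the prompt, and the `Finding` type here is invented):

```go
package main

import "fmt"

// Finding is a minimal stand-in for a review finding.
type Finding struct {
	File    string
	Message string
}

// filterToDiff drops findings that reference files not present in the
// diff — one concrete reading of "do not report hypothetical issues in
// unseen code".
func filterToDiff(findings []Finding, changed map[string]bool) []Finding {
	var kept []Finding
	for _, f := range findings {
		if changed[f.File] {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	changed := map[string]bool{"internal/prompt/prompt.go": true}
	findings := []Finding{
		{File: "internal/prompt/prompt.go", Message: "nil dereference on empty diff"},
		{File: "cmd/server/main.go", Message: "hypothetical race in unseen code"},
	}
	fmt.Println(len(filterToDiff(findings, changed)))
}
```

Note this hard filter is exactly what the roborev review below flags as risky for security findings: unchanged code may be needed to confirm an exploit path, so a prompt-level instruction with an escape hatch is softer than a mechanical drop.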
The /verify skill's "recite" phase is its most powerful technique: read only
the title, predict what the content should be, then check alignment. Applied
here by reversing the old instruction "Do not review the commit message" —
the commit message now becomes the primary lens for evaluating the diff.

When a commit says "fix race condition" but the diff adds a mutex on the
wrong resource, that's a high-value finding that pure diff-scanning misses.
Intent-implementation gaps are now the first check category, above bugs and
security, because they catch the class of errors where the code is
internally consistent but doesn't do what the developer intended.

The dirty-changes prompt is unchanged since uncommitted changes have no
commit message to analyze.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The /verify and /synthesize skills both enforce quality gates — checks that
must pass before output is considered complete. Applied here as a final
self-verification instruction: every finding must reference a specific
diff location, severity must match the described impact, and no two findings
may contradict each other. Findings that fail these checks are dropped.

This catches the most embarrassing review failures (high-severity verdict
with no actual line references, "pass" with critical findings listed) at
near-zero cost since the model performs the check during the same generation.

Applied to all review types: standard, dirty, range, and security.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
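The per-finding checks in the gate could look roughly like this (illustrative only — the actual gate is a self-verification instruction the model follows during generation, not code; the contradiction check between findings is omitted here for brevity):

```go
package main

import (
	"fmt"
	"regexp"
)

// Finding is a minimal stand-in for a review finding.
type Finding struct {
	Location string // expected form: path/to/file.go:123
	Severity string
	Impact   string
}

// locRe requires a concrete file:line reference, e.g. "internal/prompt/prompt.go:39".
var locRe = regexp.MustCompile(`^\S+:\d+$`)

// passesQualityGate enforces two of the self-review checks: a specific
// diff location and a stated impact backing the severity.
func passesQualityGate(f Finding) bool {
	return locRe.MatchString(f.Location) && f.Severity != "" && f.Impact != ""
}

func main() {
	fmt.Println(passesQualityGate(Finding{"internal/prompt/prompt.go:39", "high", "drops real vulnerabilities"}))
	fmt.Println(passesQualityGate(Finding{"somewhere in the code", "high", ""}))
}
```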
The /rethink skill's evidence accumulation pattern — "1 observation is a
data point, 3+ is a pattern worth investigating" — directly applies to
the insights system. Without explicit thresholds, the insights agent may
recommend guideline changes from 1-2 occurrences (noise) or hesitate on
strong 6+ patterns.

Added tiered thresholds to the recurring patterns section and gated
guideline suggestions on minimum 3 occurrences. This helps close the
feedback loop between review noise and guideline refinement with
appropriate confidence levels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
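The tiered thresholds reduce to a small classification over occurrence counts (a sketch of the tiers as described in this PR; the function name is invented):

```go
package main

import "fmt"

// confidenceTier maps an occurrence count to the tiers described above:
// 1-2 is a data point, 3-5 a candidate, 6+ a strong recommendation.
// Guideline suggestions are gated on the "candidate" tier or higher.
func confidenceTier(occurrences int) string {
	switch {
	case occurrences >= 6:
		return "strong recommendation"
	case occurrences >= 3:
		return "candidate"
	case occurrences >= 1:
		return "data point"
	default:
		return "no evidence"
	}
}

func main() {
	for _, n := range []int{1, 3, 6} {
		fmt.Printf("%d occurrence(s): %s\n", n, confidenceTier(n))
	}
}
```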
@roborev-ci

roborev-ci bot commented Mar 25, 2026

roborev: Combined Review (3934535)

Summary Verdict: The changes successfully tighten review prompts with severity definitions and evidence thresholds, but introduce regressions by restricting blind-spot-driven guideline generation and creating false-negative paths for
security vulnerabilities outside the immediate diff.

Medium Severity

  • Location: internal/prompt/prompt.go:39, 78, 108, 921

    • Problem: The new instructions tell reviewers not to report issues in code not shown in the diff
      and require a "plausible exploit path visible in the diff" for security reviews. A malicious contributor can exploit this by submitting a small change that connects external input to an existing dangerous sink outside the patch. The prompt directs the reviewer to drop the finding if the sink or full taint flow lives in unchanged code, creating a
      predictable false-negative path for real security bugs.
    • Fix: Keep the anti-speculation guardrail, but explicitly allow reviewers to inspect unchanged surrounding code when needed to validate whether the changed path introduces a vulnerability. A safer rule is: "Do not speculate without evidence, but you may use nearby unchanged
      code to confirm whether the diff creates or exposes an exploit path."
  • Location: internal/prompt/insights.go:53

    • Problem: Section 5 now allows guideline suggestions only from section 1 or section 3 evidence, but excludes section 2 recurring blind spots. This
      means the insights pass can identify a repeated missing-guideline pattern in section 2 and still be unable to recommend the corresponding guideline text, which is a direct regression in the output's usefulness.
    • Fix: Allow section 2 patterns with the same evidence threshold to feed section 5 guideline suggestions.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)
