Address actionable follow-up issues by wwind123 · Pull Request #55 · wwind123/coding-review-agent-loop

wwind123 · 2026-05-02T07:59:55Z

Summary

Deduplicate approved-review future follow-up issues across reviewers, preserve merged reviewer attribution, and de-duplicate created issue links in the PR summary.
Keep explicit workdirs conservative while cleaning dirty tool-owned default temp checkouts before reuse.
Add scratch-file prompt guidance and update docs for disposable default temp workdirs.

Issue handling

Closes #52
Closes #50
Closes #47
Closes #40
Closes #38
Closes #37

Open issues #33 and #34 are already addressed by current URL surfacing on main. Open issues #36 and #39 are already addressed by the current follow-up issue body, which no longer hardcodes an OpenAI Codex footer. I left #41 alone because the dry-run behavior is consistent with the existing dry-run contract and does not warrant more issue-mode special casing in this pass.

Tests

python -m pytest -q
python -m compileall -q src tests
git diff --check

wwind123 · 2026-05-02T08:00:04Z

Implemented and opened PR #55: #55

Covered the current actionable issue set: follow-up issue dedupe/attribution, duplicate summary URL handling, all-None summary suppression, default temp workdir cleanup, explicit workdir remote/dirty validation, scratch-file prompt guidance, and docs updates.

Verified with:

python -m pytest -q (121 passed)
python -m compileall -q src tests
git diff --check

-- OpenAI Codex

wwind123 · 2026-05-02T08:03:05Z

I now have a complete picture of the PR. Let me compile my review.

PR #55 Review: Address actionable follow-up issues

Summary of changes

This PR bundles six issue fixes across three domains:

Followup deduplication (orchestrator.py): _dedupe_approved_followups groups identical followups from multiple reviewers, preserving merged attribution and emitting one issue per unique item. The normalization strategy (full-text key + optional bold-heading key) is sound.
Summary suppression (orchestrator.py): The comment summarizing created issue URLs is now guarded by if issue_urls:, suppressing the all-None case that previously posted misleading "Created issue URL unavailable" noise. URL-level deduplication via dict.fromkeys in _format_created_followup_issue_summary is a safe secondary guard.
Auto workdir cleanup (config.py): Dirty default temp checkouts now log a warning and run git reset --hard + git clean -fd before syncing, matching the "tool-owned disposable" contract documented in README and docs.
Explicit workdir validation (config.py): New validate_explicit_workdir correctly fails on dirty git dirs and wrong-repo remotes, while silently skipping non-git dirs (e.g., fresh empty dirs passed by users). The check ordering (dirty → remote) is fine — either error blocks the run.
Scratch-file guidance (prompts.py): _scratch_file_guidance() is injected into all five prompt builders. The placement in build_task_prompt (between "End your final response with both markers:" and the marker block) is slightly awkward but doesn't affect agent behavior.
Docs (README.md, docs/local_agent_loop.md): Accurately reflect the new auto-dir vs. explicit-dir contracts.

Correctness

_dedupe_approved_followups: Key lookup is correct. Empty-string keys are guarded by if key in indexes.update. The heading-key secondary match correctly narrows false-positive deduplication to structurally similar items.
validate_explicit_workdir calls remote get-url origin with default check=True. If a git repo has no origin remote, this raises a raw exception rather than an AgentLoopError. The same pattern exists in ensure_temp_checkout, so this is consistent pre-existing behavior, not a regression.
issue_urls suppression: when all create_issue calls return None (CLI output not captured), the summary comment is silently dropped rather than posting a misleading "unavailable" message. This is intentional and tested.
Title construction for grouped followups (_followup_issue_title(ApprovedFollowup(reviewer=followup.reviewers[0], text=followup.text))) is correct — title uses the deduped text and first reviewer name.

Test coverage

New tests cover all four workdir scenarios: auto-dir cleanup (test_dirty_existing_auto_agent_dir_is_cleaned_before_sync), explicit-dir dirty rejection (test_dirty_explicit_agent_dir_fails_clearly), wrong-repo rejection (test_explicit_agent_dir_must_match_requested_repo).
test_pr_loop_deduplicates_approved_followup_issues_across_reviewers exercises the 2-reviewer dedup path end-to-end and verifies merged attribution body format.
test_pr_loop_suppresses_followup_issue_summary_when_no_urls_returned correctly validates the suppression behavior.
121 tests pass (per PR author verification).

No blocking issues found.

-- Anthropic Claude

wwind123 · 2026-05-02T08:04:41Z

The PR looks solid and addresses the targeted follow-up issues effectively. The deduplication logic significantly improves the experience when using multiple reviewers, and the workdir cleanup/validation adds helpful safety and convenience.

Same-PR follow-ups

None

Future follow-ups

None

-- Google Gemini

wwind123 · 2026-05-02T08:05:20Z

No code changes were needed. I verified PR #55’s live comments: Claude approved with no blocking items, and Gemini’s same-PR follow-ups are None.

Validation run:
python -m pytest -q → 121 passed

Branch is clean and already synced with origin/codex/address-open-followups, so there was nothing to commit or push.

-- OpenAI Codex

wwind123 · 2026-05-02T08:05:35Z

Both Claude (round 1) and Gemini have already approved this PR with no blocking issues and no same-PR follow-ups. The head SHA is unchanged (4826381). My round 1 analysis stands — the changes are correct, well-tested, and all six targeted issues are properly addressed. No new information has emerged since that review.

-- Anthropic Claude

wwind123 · 2026-05-02T08:06:33Z

I have reviewed pull request #55 "Address actionable follow-up issues" (round 2). The code remains unchanged since the previous approval round (Head SHA: 48263813c3435cf922f99333e5d0b598ec2cbea2), and my analysis confirms that the changes are correct and effectively address the targeted issues.

Review Summary

Follow-up Deduplication: The grouping logic in orchestrator.py correctly handles identical or highly similar follow-ups from multiple reviewers, preserving attribution while reducing issue churn. The normalization strategy is robust.
Workdir Management: The updated logic for cleaning dirty default workdirs and validating explicit workdirs adds necessary safety and convenience. The distinction between "tool-owned disposable" and "user-owned conservative" directories is well-documented and correctly implemented.
Prompt Guidance: The inclusion of scratch-file guidance in the agent prompts is a helpful addition to maintain repo cleanliness.
Summary Suppression: Silencing the issue summary comment when no issues are created (e.g., in CLI mode where output isn't captured) removes misleading "unavailable URL" messages.
Testing: New tests provide excellent coverage for all scenarios described in the PR. All 121 tests passed successfully in my verification run.

Same-PR follow-ups

None

Future follow-ups

None

-- Google Gemini

wwind123 · 2026-05-02T08:07:20Z

No code changes were needed. I verified PR #55’s live comments: Claude and Gemini both approved round 2 with same-PR follow-ups listed as None.

Validation:
python -m pytest -q -> 121 passed

Branch is clean and synced with origin/codex/address-open-followups at 48263813c3435cf922f99333e5d0b598ec2cbea2, so there was nothing to commit or push.

-- OpenAI Codex

wwind123 · 2026-05-02T08:07:32Z

The head SHA is unchanged (4826381). Both Claude and Gemini approved in round 2, Codex confirmed no changes were needed and 121 tests pass. Nothing has changed since my round 2 approval. My analysis from rounds 1 and 2 stands.

-- Anthropic Claude

wwind123 · 2026-05-02T08:08:32Z

I have reviewed pull request #55 "Address actionable follow-up issues" (round 3). The changes are correct, well-tested, and effectively address the targeted issues.

Review Summary

Follow-up Deduplication: The grouping logic in orchestrator.py correctly handles identical or highly similar follow-ups from multiple reviewers, preserving attribution while reducing issue churn. The normalization strategy is robust and properly handles varied formatting.
Workdir Management: The updated logic for cleaning dirty default (tool-owned) workdirs and validating explicit (user-owned) workdirs adds necessary safety and convenience. The distinction between disposable and conservative directories is well-documented and correctly implemented.
Prompt Guidance: The inclusion of scratch-file guidance in the agent prompts is a helpful addition to maintain repository cleanliness during automated tasks.
Summary Suppression: Silencing the issue summary comment when no issues are created (e.g., in CLI mode where output isn't captured) removes misleading "unavailable URL" messages and reduces noise.
Testing: The new tests provide excellent coverage for all scenarios described in the PR, including complex deduplication cases and workdir validation. All 121 tests passed successfully in my verification run.

Same-PR follow-ups

None

Future follow-ups

None

-- Google Gemini

wwind123 · 2026-05-02T08:09:05Z

No code changes were needed. I verified PR #55’s live comments: Claude and Gemini both approved round 3 with same-PR follow-ups listed as None.

Validation:
python -m pytest -q -> 121 passed

Branch is clean and synced at 48263813c3435cf922f99333e5d0b598ec2cbea2, matching origin/codex/address-open-followups and PR #55 head, so there was nothing to commit or push.

-- OpenAI Codex

wwind123 · 2026-05-02T08:09:16Z

Head SHA is still 4826381 — no changes since rounds 1–3. Both Claude and Gemini approved in round 3, Codex confirmed 121 tests pass and the branch is clean. My approval from prior rounds stands unchanged.

-- Anthropic Claude

wwind123 · 2026-05-02T08:10:26Z

Warning: True color (24-bit) support not detected. Using a terminal with true color enabled will result in a better visual experience.
YOLO mode is enabled. All tool calls will be automatically approved.
YOLO mode is enabled. All tool calls will be automatically approved.
I will begin by retrieving the pull request details and the diff to understand the scope of the changes and any existing context from the PR conversation.

I will check the definition of _run_git in src/coding_review_agent_loop/config.py to see how it handles command failures.

I will check src/coding_review_agent_loop/protocol.py to understand the Runner protocol and its run method.

I will check src/coding_review_agent_loop/runner.py to find the Runner class and its run method.

I will run the tests to verify the changes and ensure everything is working as expected.

I will check the post_pr_comment function in src/coding_review_agent_loop/orchestrator.py to see how it behaves in dry-run mode.

I will read the post_pr_comment function in src/coding_review_agent_loop/github.py.

I will check if _format_created_followup_issue_summary is used anywhere else in the codebase.

I have reviewed pull request #55 "Address actionable follow-up issues" (round 4). The changes are correct, well-tested, and effectively consolidate several outstanding follow-up items into a cohesive update.

Review Summary

Follow-up Deduplication: The grouping logic in orchestrator.py successfully handles identical or similar follow-ups from multiple reviewers. The normalization strategy (removing formatting, handling bold headings, and word-based heuristics) is robust and correctly preserves attribution for all reviewers in the resulting issue body.
Workdir Management: The implementation of auto-cleanup for "tool-owned" default workdirs and stricter validation for "user-owned" explicit workdirs provides a clear and safe contract for directory reuse. The remote validation check ensures that explicit directories point to the expected repository.
Prompt Guidance: The addition of scratch-file guidance to the agent prompts is a proactive measure to maintain repository cleanliness and guide agents toward better workspace habits.
Summary Suppression: Silencing the issue summary comment when no URLs are returned (common in CLI/headless environments or dry runs) reduces noise and avoids misleading "unavailable URL" messages.
Testing: New test cases provide thorough coverage for deduplication edge cases and workdir validation scenarios. All 121 tests passed successfully in my verification run.

Same-PR follow-ups

None

Future follow-ups

None

-- Google Gemini

wwind123 · 2026-05-02T08:10:58Z

No code changes were needed. I verified PR #55’s live comments: Claude and Gemini both approved round 4 with same-PR follow-ups listed as None.

Validation:
python -m pytest -q -> 121 passed

Branch is clean and synced with origin/codex/address-open-followups at 48263813c3435cf922f99333e5d0b598ec2cbea2, so there was nothing to commit or push.

-- OpenAI Codex

wwind123 · 2026-05-02T08:11:10Z

Head SHA remains 4826381. Both reviewers approved in round 4, 121 tests pass, no changes have been made. My approval stands.

-- Anthropic Claude

wwind123 · 2026-05-02T08:11:52Z

I have reviewed pull request #55 "Address actionable follow-up issues" (round 5). The changes are correct, well-tested, and effectively address the targeted follow-up items.

Review Summary

Follow-up Deduplication: The grouping logic in orchestrator.py correctly handles identical or highly similar follow-ups from multiple reviewers, preserving attribution while reducing issue noise. The normalization strategy is robust.
Workdir Management: The updated logic for cleaning dirty default (tool-owned) workdirs and validating explicit (user-owned) workdirs adds necessary safety and convenience. The distinction between disposable and conservative directories is well-implemented and documented.
Prompt Guidance: The inclusion of scratch-file guidance in agent prompts is a proactive measure to maintain repository cleanliness.
Summary Suppression: Silencing the issue summary comment when no issues are created removes misleading "unavailable URL" messages and reduces noise in CLI or dry-run environments.
Testing: The new tests provide excellent coverage for the added features, including complex deduplication cases and workdir validation scenarios. All 121 tests passed.

-- Google Gemini

wwind123 · 2026-05-02T08:11:55Z

Created approved-review future follow-up issues for PR #55:

These were mentioned in approved reviews as future work and did not block merge readiness.

Skipped 1 additional item(s) to avoid issue noise; reviewers should reserve this section for substantial independent follow-up work.

-- coding-review-agent-loop

Address actionable follow-up issues

4826381

wwind123 mentioned this pull request May 2, 2026

Follow up future review note: None #56

Closed

This was referenced May 2, 2026

Follow up future review note: None #57

Closed

Follow up future review note: None #58

Closed

wwind123 merged commit c3029ca into main May 2, 2026
1 check passed

wwind123 mentioned this pull request May 2, 2026

Clean dirty default agent workdirs #49

Closed

Conversation

wwind123 commented May 2, 2026

Summary

Issue handling

Tests

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

PR #55 Review: Address actionable follow-up issues

Summary of changes

Correctness

Test coverage

Uh oh!

wwind123 commented May 2, 2026

Same-PR follow-ups

Future follow-ups

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Review Summary

Same-PR follow-ups

Future follow-ups

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Review Summary

Same-PR follow-ups

Future follow-ups

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Review Summary

Same-PR follow-ups

Future follow-ups

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

wwind123 commented May 2, 2026

Review Summary

Uh oh!

wwind123 commented May 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant