Skip to content

Rework pr-reviewer workflow around suggestions#707

Open
aram356 wants to merge 6 commits into
mainfrom
pr-reviewer-iterative-workflow
Open

Rework pr-reviewer workflow around suggestions#707
aram356 wants to merge 6 commits into
mainfrom
pr-reviewer-iterative-workflow

Conversation

@aram356

@aram356 aram356 commented May 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Reworks the pr-reviewer agent around a single review pass that posts GitHub review comments and user-approved suggestion blocks, instead of pushing fixes or opening a stacked fix-up PR.
  • Adds explicit PR, branch-remote, and branch-local modes with private review refs, merge-base-aware diff handling, branch-only output, and a head re-check before submission.
  • Tightens CI and submission behavior: required checks are classified from gh pr checks --json buckets, failed required CI becomes a blocking body-level finding, pending reviews are preserved unless the user chooses to delete them, and review payloads are built as strict JSON.

Changes

File Change
.claude/agents/pr-reviewer.md Replaces the previous fix-up-PR loop with a one-pass review workflow. Documents mode resolution, worktree setup, full changed-file reads, CI classification, finding triage, GitHub suggestion eligibility, scratch verification, verdict precedence, branch-only rendering, PR head drift handling, pending-review handling, and safe gh api review submission.

Closes

Closes #706

Test plan

This is a workflow/documentation-only change under .claude/agents/; no Rust or JS source is affected.

  • git diff --check -- .claude/agents/pr-reviewer.md
  • Mocked CI capture cases:
    • nonzero exit with JSON -> classify from JSON
    • exit 8 with no JSON -> pending/no-output note
    • nonzero exit with no JSON -> command/API/auth diagnostic
  • cargo test --workspace - n/a
  • cargo clippy --workspace --all-targets --all-features -- -D warnings - n/a
  • cargo fmt --all -- --check - n/a
  • JS tests: cd crates/js/lib && npx vitest run - n/a
  • JS format: cd crates/js/lib && npm run format - n/a
  • Docs format: cd docs && npm run format - n/a, file is outside docs/

Checklist

  • Changes follow CLAUDE.md conventions
  • No production code changed
  • No secrets or credentials committed

The pr-reviewer agent now iterates review → implement → re-review on
a stacked fix-up PR until a full pass surfaces no actionable findings,
rather than producing a single read-only review.

Key workflow changes:
 - Re-resolve the PR's current head at the start of every pass; work
   against origin/<headRefName> rather than a previously checked-out copy
 - Triage each finding twice: include in review, and implement as code
 - One fix-up branch / PR per review engagement, reused across passes:
   branch `review/<timestamp>-<pr-number>` (UTC YYYYMMDD-HHMMSS), title
   `<timestamp> Review fixes for #<pr-number>`, base = the PR's head
 - Inline comments and the review-body summary reference the fix-up PR
   for findings that were implemented; verdict no longer forces
   REQUEST_CHANGES when all wrenches are addressed in the fix-up PR
 - Stop conditions: no new actionable findings (ideal merge candidate),
   blocked on author, or user says so
 - Rules forbid pushing or submitting without explicit user approval,
   targeting `main` from a fix-up PR, or skipping --force-with-lease
   after a rebase

Closes #706
@aram356 aram356 self-assigned this May 17, 2026

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review by Yesman.

I found one blocking correctness issue in the updated pr-reviewer instructions. The PR adds a stale-head guardrail in Step 1, but Step 2 still says to enumerate changed files with git diff main...HEAD --name-only (.claude/agents/pr-reviewer.md, current line 59). Because that line is not part of the submitted diff, I could not attach this as an inline comment.

This should use the just-fetched PR head, not the reviewer worktree's local HEAD. Otherwise later passes can enumerate files from an old checkout, a fix-up branch, or another branch entirely, causing the agent to miss current PR changes before submitting a review. Please make it consistent with Step 1, for example git diff origin/<baseRefName>...origin/<headRefName> --name-only after fetching the base, or git diff main...origin/<headRefName> --name-only if main is intentionally fixed.

CI is mostly passing with some checks pending at review time.

Comment thread .claude/agents/pr-reviewer.md Outdated
@aram356 aram356 changed the title Make pr-reviewer agent iterate to an ideal merge candidate Rework pr-reviewer workflow around suggestions Jun 9, 2026

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 against origin/main at head 5ea9211305c040fe8ddce01da70b392aa63db554. The workflow rewrite is generally careful around stale heads, suggestion safety, pending reviews, and JSON review submission, but I found one CI-coverage gap that can cause the reviewer to miss failing Trusted Server gates.

Findings

P1 / High

  1. Do not limit CI classification to branch-protection-required checks.claude/agents/pr-reviewer.md:370
    • Issue: The new CI collection uses gh pr checks --required and the body template later says to render only checks reported by that required-only query. In this repository, that currently returns only cargo test, format-typescript, cargo fmt, and format-docs; it omits checks such as Analyze (rust), vitest, CodeQL, browser integration tests, and integration tests.
    • Why it matters: CLAUDE.md treats those broader workflows as PR gates. If one of the omitted checks fails, the reviewer can still report clean CI and potentially approve/comment with no failed-CI finding, hiding a real regression in Trusted Server's Rust/JS/integration validation.
    • Suggested fix: Query and report all PR checks (for example, gh pr checks "$NUMBER" --json name,bucket,state,link), then separately mark which are required if branch-protection status matters. At minimum, classify failures/cancellations from the full check set as review findings, while preserving the existing required-check handling for merge-blocking branch protection.

CI / Existing Reviews

CI is currently passing. Existing review feedback included an older stale-HEAD concern that appears addressed by the current $DIFF_RANGE / private-ref workflow; I did not duplicate that finding.

Comment thread .claude/agents/pr-reviewer.md Outdated
aram356 added 2 commits June 15, 2026 15:58
…ecks

The CI collection used gh pr checks --required, which in this repo returns
only the branch-protection-required checks and omits gates like Analyze
(rust), vitest, CodeQL, and integration tests. A failure in one of those
could be missed, letting the reviewer report clean CI and approve over a
real regression.

Query the full check set for failure classification, capture the required
names separately so failed required checks can still be flagged as
merge-blocking, and render every reported check (annotated as required) in
the CI Status section.

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 at head a7b267c955d45a27a309b23e7bef33b787ae288f. The workflow rewrite is careful overall and the previously noted stale-head and required-only CI issues appear addressed in this revision. I found one non-blocking robustness gap around reusing the agent-owned worktree after an interrupted suggestion-verification pass; no blocking correctness, security, data-loss, authorization, or severe compatibility issues found.

Findings

  • P2: Reused reviewer worktrees should clear ignored JS bundles before a new pass — see inline comment.

CI / Existing Reviews

All GitHub checks currently pass. Existing review feedback covered the stale HEAD diff issue and the required-only CI query gap; I did not duplicate those findings.

# (which could have moved between the verify and the reset), so
# don't fetch again; just reset to the already-verified $HEAD_REF.
git -C "$WT" reset --hard "$HEAD_REF"
git -C "$WT" clean -fd

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: ♻️ refactor — Reused reviewer worktrees should clear ignored JS bundles before a new pass.

The reuse path resets tracked files and removes untracked non-ignored files, but it leaves ignored artifacts such as crates/js/dist/ behind. Later in the workflow we explicitly call out that a stale crates/js/dist/ can leak into Rust compilation via the JS bundle build path, and the per-suggestion loop only removes it when the previous iteration in the same invocation set prev_iteration_ran_js_build=1. If an earlier invocation is interrupted after running node build-all.mjs, the next invocation reuses the same $WT and can start scratch verification with stale ignored bundles from the old head/suggestion, which can produce misleading compile/test results.

Suggested fix: in the existing-worktree setup path, clear known ignored generated inputs that affect verification before starting the new pass, e.g. add git -C "$WT" clean -fdx crates/js/dist (or an equivalent targeted removal) alongside this reset/clean sequence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make pr-reviewer agent iterate to an ideal merge candidate

2 participants