Rework pr-reviewer workflow around suggestions#707
Conversation
The pr-reviewer agent now iterates review → implement → re-review on a stacked fix-up PR until a full pass surfaces no actionable findings, rather than producing a single read-only review. Key workflow changes: - Re-resolve the PR's current head at the start of every pass; work against origin/<headRefName> rather than a previously checked-out copy - Triage each finding twice: include in review, and implement as code - One fix-up branch / PR per review engagement, reused across passes: branch `review/<timestamp>-<pr-number>` (UTC YYYYMMDD-HHMMSS), title `<timestamp> Review fixes for #<pr-number>`, base = the PR's head - Inline comments and the review-body summary reference the fix-up PR for findings that were implemented; verdict no longer forces REQUEST_CHANGES when all wrenches are addressed in the fix-up PR - Stop conditions: no new actionable findings (ideal merge candidate), blocked on author, or user says so - Rules forbid pushing or submitting without explicit user approval, targeting `main` from a fix-up PR, or skipping --force-with-lease after a rebase Closes #706
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review by Yesman.
I found one blocking correctness issue in the updated pr-reviewer instructions. The PR adds a stale-head guardrail in Step 1, but Step 2 still says to enumerate changed files with git diff main...HEAD --name-only (.claude/agents/pr-reviewer.md, current line 59). Because that line is not part of the submitted diff, I could not attach this as an inline comment.
This should use the just-fetched PR head, not the reviewer worktree's local HEAD. Otherwise later passes can enumerate files from an old checkout, a fix-up branch, or another branch entirely, causing the agent to miss current PR changes before submitting a review. Please make it consistent with Step 1, for example git diff origin/<baseRefName>...origin/<headRefName> --name-only after fetching the base, or git diff main...origin/<headRefName> --name-only if main is intentionally fixed.
CI is mostly passing with some checks pending at review time.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 against origin/main at head 5ea9211305c040fe8ddce01da70b392aa63db554. The workflow rewrite is generally careful around stale heads, suggestion safety, pending reviews, and JSON review submission, but I found one CI-coverage gap that can cause the reviewer to miss failing Trusted Server gates.
Findings
P1 / High
- Do not limit CI classification to branch-protection-required checks —
.claude/agents/pr-reviewer.md:370- Issue: The new CI collection uses
gh pr checks --requiredand the body template later says to render only checks reported by that required-only query. In this repository, that currently returns onlycargo test,format-typescript,cargo fmt, andformat-docs; it omits checks such asAnalyze (rust),vitest, CodeQL, browser integration tests, and integration tests. - Why it matters:
CLAUDE.mdtreats those broader workflows as PR gates. If one of the omitted checks fails, the reviewer can still report clean CI and potentially approve/comment with no failed-CI finding, hiding a real regression in Trusted Server's Rust/JS/integration validation. - Suggested fix: Query and report all PR checks (for example,
gh pr checks "$NUMBER" --json name,bucket,state,link), then separately mark which are required if branch-protection status matters. At minimum, classify failures/cancellations from the full check set as review findings, while preserving the existing required-check handling for merge-blocking branch protection.
- Issue: The new CI collection uses
CI / Existing Reviews
CI is currently passing. Existing review feedback included an older stale-HEAD concern that appears addressed by the current $DIFF_RANGE / private-ref workflow; I did not duplicate that finding.
…ecks The CI collection used gh pr checks --required, which in this repo returns only the branch-protection-required checks and omits gates like Analyze (rust), vitest, CodeQL, and integration tests. A failure in one of those could be missed, letting the reviewer report clean CI and approve over a real regression. Query the full check set for failure classification, capture the required names separately so failed required checks can still be flagged as merge-blocking, and render every reported check (annotated as required) in the CI Status section.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 at head a7b267c955d45a27a309b23e7bef33b787ae288f. The workflow rewrite is careful overall and the previously noted stale-head and required-only CI issues appear addressed in this revision. I found one non-blocking robustness gap around reusing the agent-owned worktree after an interrupted suggestion-verification pass; no blocking correctness, security, data-loss, authorization, or severe compatibility issues found.
Findings
- P2: Reused reviewer worktrees should clear ignored JS bundles before a new pass — see inline comment.
CI / Existing Reviews
All GitHub checks currently pass. Existing review feedback covered the stale HEAD diff issue and the required-only CI query gap; I did not duplicate those findings.
| # (which could have moved between the verify and the reset), so | ||
| # don't fetch again; just reset to the already-verified $HEAD_REF. | ||
| git -C "$WT" reset --hard "$HEAD_REF" | ||
| git -C "$WT" clean -fd |
There was a problem hiding this comment.
Automated review: ♻️ refactor — Reused reviewer worktrees should clear ignored JS bundles before a new pass.
The reuse path resets tracked files and removes untracked non-ignored files, but it leaves ignored artifacts such as crates/js/dist/ behind. Later in the workflow we explicitly call out that a stale crates/js/dist/ can leak into Rust compilation via the JS bundle build path, and the per-suggestion loop only removes it when the previous iteration in the same invocation set prev_iteration_ran_js_build=1. If an earlier invocation is interrupted after running node build-all.mjs, the next invocation reuses the same $WT and can start scratch verification with stale ignored bundles from the old head/suggestion, which can produce misleading compile/test results.
Suggested fix: in the existing-worktree setup path, clear known ignored generated inputs that affect verification before starting the new pass, e.g. add git -C "$WT" clean -fdx crates/js/dist (or an equivalent targeted removal) alongside this reset/clean sequence.
Summary
pr-revieweragent around a single review pass that posts GitHub review comments and user-approvedsuggestionblocks, instead of pushing fixes or opening a stacked fix-up PR.gh pr checks --jsonbuckets, failed required CI becomes a blocking body-level finding, pending reviews are preserved unless the user chooses to delete them, and review payloads are built as strict JSON.Changes
.claude/agents/pr-reviewer.mdgh apireview submission.Closes
Closes #706
Test plan
This is a workflow/documentation-only change under
.claude/agents/; no Rust or JS source is affected.git diff --check -- .claude/agents/pr-reviewer.mdcargo test --workspace- n/acargo clippy --workspace --all-targets --all-features -- -D warnings- n/acargo fmt --all -- --check- n/acd crates/js/lib && npx vitest run- n/acd crates/js/lib && npm run format- n/acd docs && npm run format- n/a, file is outsidedocs/Checklist