ci: claude-review as github-actions[bot], branded, injection-hardened#33
Conversation
Pass the default GITHUB_TOKEN via github_token so claude-code-action operates as github-actions[bot] (the same identity codex uses) instead of the Claude GitHub App. Review comments no longer show 'Claude' as the author. 🤖 Built with SMT <smt@agora.build>
🤖 Codex Code ReviewNo findings. The change is narrowly scoped to Residual risk: I did not verify the upstream action implementation, so the only behavior to confirm after merge is that |
Prompt now instructs the review to start with '## 🤖 Claude Code Review' and omit the PR title heading, removing the 'Code Review: <title>' line. 🤖 Built with SMT <smt@agora.build>
|
Claude finished @guohai's task in 36s —— View job 🤖 Claude Code Review
The change is small, correct, and matches its stated goal. Passing Permissions — the job already declares Fork PRs — the one thing to confirm. The trigger is
Minor (not a bug). Self-note is accurate — because this edits No changes requested. LGTM, with the fork-PR behavior as the only thing worth confirming against your contribution model. |
🤖 Codex Code ReviewNo findings. I reviewed the PR diff for Residual notes:
|
…header) Switch claude-review to plain automation: Claude returns the review as its result, and a github-script step posts it as '## 🤖 Claude Code Review' + content (authored by github-actions[bot]). Drops track_progress so the hardcoded 'Claude finished … View job' line no longer appears. Mirrors the codex-review pattern. 🤖 Built with SMT <smt@agora.build>
🤖 Codex Code ReviewFindings
Open Questions
|
🤖 Claude Code ReviewCode Review: PR #33 — claude-review posts as github-actions[bot]I reviewed the full rewrite of OverallThe workflow is reasonable and the new "let Claude return the review, we post it ourselves via github-script" pattern is sound. The JSON parsing in the Post step is defensively written (try/catch, result-then-assistant fallback, handles string vs. array content). No credential leakage — secrets are referenced via A few points worth raising: 1. The stated mechanism doesn't match the implementation (not a bug, but the PR description is misleading)The PR claims authorship becomes So the 2. Prompt injection via untrusted PR fields (primary security concern)
Mitigating factor: this is the 3. Minor
VerdictNo blocking issues. The change achieves bot authorship (via the github-script step) and is safe under the |
Start directly with findings; we prepend the '## 🤖 Claude Code Review' header, so the model's own 'Code Review: <title>' heading is redundant. 🤖 Built with SMT <smt@agora.build>
🤖 Claude Code ReviewI reviewed the full cumulative diff (commits Findings1. Prompt injection into a write-capable agent (security — main concern)
PR Title: ${{ github.event.pull_request.title }}
PR Description: ${{ github.event.pull_request.body }}The only thing stopping a malicious PR description from instructing Claude to post comments, approve, or take other write actions is the natural-language instruction "do not post any comments yourself." That is a weak control — prompt instructions are exactly what injection overrides. The diff content Claude reads from the checked-out tree is similarly untrusted. Mitigating factor worth confirming: this is a 2. Untrusted-ish value interpolated into a shell
|
🤖 Codex Code ReviewFindings
Open Questions
|
Address the review's main finding: the agent was handed a write-capable token while untrusted PR title/diff were injected into the prompt, guarded only by a text instruction. Now: - 'review' job runs the agent with a READ-ONLY token (contents+PR read), so injected instructions cannot escalate to write actions; PR metadata is fenced as explicitly-untrusted data. - separate 'post' job (pull-requests: write, no model) reads the produced review from an artifact and posts it via github-script. - base.ref no longer interpolated into the shell run block (env var instead). - empty/unparseable review now fails the step instead of silently no-op. 🤖 Built with SMT <smt@agora.build>
🤖 Claude Code ReviewOverall this is a well-conceived security redesign: splitting the model-driven review (read-only token) from the comment posting (write token, no model) is the right pattern, and it correctly neutralizes prompt-injection escalation. A few real issues: Functional regressionsSticky comment behavior lost → comment pile-up. The old config used concurrency:
group: claude-review-${{ github.event.pull_request.number }}
cancel-in-progress: true(Note concurrency alone won't dedupe across the PR lifetime — only the upsert fixes the pile-up.) Fork PRs will fail to post. For MinorComment header is slightly overstated. "nothing untrusted reaches a writer" — the model's review text is itself influenced by the attacker-controlled diff/files the agent reads, and that text is posted verbatim as a comment. The blast radius is limited to comment content (the writer runs no model and can't be made to use
Looks good
The core security goal is achieved. The comment pile-up is the one I'd fix before merging. |
🤖 Codex Code ReviewFindings
|
The agent gets ANTHROPIC_API_KEY/BASE_URL in env and its output is posted verbatim, so a malicious same-repo PR could try to exfiltrate secrets via the review comment (read-only GitHub token doesn't stop that). Restrict the review job to OWNER/MEMBER/COLLABORATOR authors — they already have repo-secret access via branch workflows, so this adds no exposure; fork/untrusted PRs never reach the secret-bearing agent. post job cascades via needs. 🤖 Built with SMT <smt@agora.build>
🤖 Claude Code ReviewI've reviewed the full diff. This is a well-reasoned security hardening of the review workflow — the split into a read-only A few real issues: Comment spam — sticky behavior was dropped (maintainability regression)The old config used Recommend restoring sticky behavior: tag the comment body with a hidden marker (e.g.
|
🤖 Codex Code ReviewFindings
No other security or credential-exposure issues stood out in the workflow split; the read-only model job plus separate posting job is a solid improvement. |
What
Reworks
claude-code-review.ymlso the review comment is authored by github-actions[bot] (like codex), branded 🤖 Claude Code Review with no wrapper/title lines, and the agent is not given write access.Design (two jobs)
review— runsclaude-code-actionwith a read-only token (contents: read,pull-requests: read). PR title/diff are injected only as fenced untrusted data. The model emits the review as its result;execution_fileis uploaded as an artifact. Even if prompt-injected, the agent has no write capability.post— separate job,pull-requests: write, runs no model. Downloads the artifact and posts## 🤖 Claude Code Review+ content viagithub-script(default token →github-actions[bot]). Fails loudly if the review is empty/unparseable.Why
Replaces the previous
code-reviewplugin +track_progressflow, which (a) showed the hardcoded "Claude finished … View job" line and aCode Review: <title>heading, and (b) handed a write-capable token to the agent alongside untrusted PR text — flagged by both codex and Claude's own review as the main risk.Also addressed (from review)
base.refno longer interpolated into arun:block (uses anenv:var).Verification
On this PR:
reviewjob confirmedContents: read/PullRequests: read;postjob published a clean🤖 Claude Code Reviewcomment asgithub-actions[bot]; codex-review green.Generated with SMT smt@agora.io