Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -124,11 +124,11 @@ Five workflow files form the pipeline; each owns one responsibility.
- **Runners are explicitly pinned repo-wide.** Every `runs-on:` across `.github/workflows/` targets an explicit image (`ubuntu-24.04`), never a `*-latest` rolling alias, so a GitHub-side image bump can't silently change the OS / preinstalled tools / Docker daemon under a job. `ci.yml` runs `bun run check:runner-pins` (`scripts/check-runner-pins.ts`), which fails the build if any `runs-on:` is on a `*-latest` alias; matrix expressions (`runs-on: ${{ matrix.runner }}`) are exempt since the literal lives at the matrix entry. See issue #173.
- Defense-in-depth on workflow injection: every dynamic input flowing into a `run:` block is passed via `env:` first.
- **GitHub Actions are SHA-pinned.** Every third-party `uses:` reference is pinned to a full 40-char commit SHA (with a `# vX.Y.Z` comment), not a mutable tag, so a force-moved upstream tag cannot change the bytes a runner executes. Renovate keeps the SHAs current via the `helpers:pinGitHubActionDigests` preset behind the existing 7-day `minimumReleaseAge` soak. `ci.yml` runs `bun run check:action-pins` (`scripts/check-action-pins.ts`), which fails the build if any third-party `uses:` is on a tag. Local reusable-workflow calls (`uses: ./...`) are exempt.
- **Test-glob + destructive-action guards run in CI.** `ci.yml` also runs `bun run check:test-globs` (`scripts/check-test-globs.ts`, fails if any `*.test.ts` is unreachable by the runner glob set, issue #201) and `bun run check:no-destructive` (`scripts/check-no-destructive-actions.ts`, FR-009: fails if `src/workflows/ship/` or a `src/daemon/scoped-*-executor.ts` contains a force-push / `reset --hard` / `gh pr merge` / merge-mutation call outside comments; the scoped-executor set is derived from the filesystem so it cannot go stale, issue #203).
- **Test-glob + destructive-action guards run in CI.** `ci.yml` also runs `bun run check:test-globs` (`scripts/check-test-globs.ts`, fails if any `*.test.ts` is unreachable by the runner glob set, issue #201) and `bun run check:no-destructive` (`scripts/check-no-destructive-actions.ts`, FR-009: fails if `src/workflows/ship/` or a `src/daemon/scoped-*-executor.ts` contains a force-push / `reset --hard` / `gh pr merge` / merge-mutation call outside comments; the scoped-executor set is derived from the filesystem so it cannot go stale, issue #203). Its `FORBIDDEN` pattern set is shared with the paired runtime layer, the `PreToolUse` destructive-Bash hook (`src/core/hooks/forbidden-bash.ts`, security invariant #5), via `src/utils/forbidden-bash.ts` so the static and runtime gates stay in lockstep (issue #222).

## Security invariants (prompt-injection hardening)

Four contracts contributors MUST preserve when touching the agent execution path or any GitHub-bound write:
Five contracts contributors MUST preserve when touching the agent execution path or any GitHub-bound write:

1. **Subprocess env allowlist** (`src/core/executor.ts buildProviderEnv()`). The agent CLI receives an explicit allowlist + prefix patterns, NOT `...process.env`. If you add a new env var the CLI needs, extend the allowlist. Banned: `GITHUB_APP_PRIVATE_KEY`, `GITHUB_WEBHOOK_SECRET`, `DAEMON_AUTH_TOKEN`, `DATABASE_URL`, `VALKEY_URL`, `REDIS_URL`, `CONTEXT7_API_KEY`, `GITHUB_PERSONAL_ACCESS_TOKEN`. See `docs/operate/configuration.md` § "Subprocess env allowlist".

Expand All @@ -144,6 +144,8 @@ Four contracts contributors MUST preserve when touching the agent execution path

4. **Structured-output chokepoint** (`src/ai/structured-output.ts parseStructuredResponse()` + `src/utils/tolerant-json.ts`). Every LLM call site that expects a JSON-shaped response MUST route the model output through `parseStructuredResponse(raw, schema)` and append the encoding rules to its system prompt via `withStructuredRules(systemPrompt)`. The pipeline does code-fence stripping, strict JSON.parse, and a tolerant fallback that escapes raw LF/CR/TAB/control bytes inside JSON string values (the most common LLM failure mode, observed shipping on chat-thread). Returns a discriminated `StructuredResult<T>`; callers MUST switch on `result.ok` and own their failure policy (clarify fallback, fail-open, throw, etc.): the pipeline does not encode policy. **Wired today**: `workflows/intent-classifier.ts`, `workflows/ship/nl-classifier.ts`, `workflows/ship/scoped/chat-thread.ts`, `workflows/ship/scoped/meta-issue-classifier.ts`, `orchestrator/triage.ts`, `utils/llm-output-scanner.ts`, `workflows/handlers/triage.ts`. When adding a new LLM call site that expects JSON, do NOT call `JSON.parse` directly, route through this chokepoint so future LLM-quirk fixes apply uniformly. Strategy is observable via `result.strategy` (`"strict"` | `"tolerant"`) for monitoring model JSON-quality regressions.

5. **Runtime destructive-Bash gate** (`src/core/hooks/forbidden-bash.ts` + `src/utils/forbidden-bash.ts`). The agent subprocess runs under `bypassPermissions` with the Bash tool allowed, so a prompt-injected force-push / `git reset --hard` / `gh pr merge` / GraphQL merge mutation (`mergePullRequest` / `mergeBranch`) would otherwise execute unchecked. A `PreToolUse` hook (wired in `src/core/executor.ts` `queryOptions.hooks`, matcher `Bash`) denies any Bash command matching the shared `FORBIDDEN` pattern set at runtime, backing the prompt-only bans with an enforced gate. The patterns live in `src/utils/forbidden-bash.ts` as the single source of truth, shared with the static `check:no-destructive` CI guard (`scripts/check-no-destructive-actions.ts`) so the build-time and runtime layers cannot drift. A deny emits `event: "agent.hook.denied"` (fields `tool`, `rule`) and NEVER logs the raw command (token-leak risk). This stops literal destructive commands and is backed by remote branch protection plus a least-privilege token; a regex denylist on free-form shell is not a complete sandbox (it cannot defeat obfuscation like variable indirection or `base64|sh`). Issue #222.

The `triggerUsername` is rejected (not silently stripped) if it contains whitespace/newline, git commit trailer forging vector. Don't relax that check.

## Documentation
Expand Down
1 change: 1 addition & 0 deletions docs/build/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ flowchart TD
- **The webhook server never runs the pipeline.** Only daemons execute `runPipeline`. The webhook server is the orchestrator: it enqueues jobs and optionally spawns ephemeral daemons.
- **Every orchestrator runs a queue worker.** `src/orchestrator/queue-worker.ts` polls `queue:jobs` via `LMOVE` into a per-instance processing list (`queue:processing:{instanceId}`), offers the job to a locally-connected daemon, and atomically re-queues it to the head when no local daemon can take it. Multi-orchestrator HA: `LMOVE` grants exactly-once claim across instances; the offer/accept round-trip stays in-process. Crash recovery is handled by each orchestrator draining its own processing list at startup, plus a cross-instance reaper (`src/orchestrator/valkey-cleanup.ts`) draining processing lists owned by instances whose `orchestrator:{id}:alive` liveness key has expired.
- **MCP servers.** Tracking-comment updates, inline PR reviews, scoped review-thread resolves, daemon-capability reports, repo-memory, and (optionally) Context7 library docs are exposed as MCP servers the agent can call. Git changes are made via the Bash tool against the cloned repo, not through a dedicated MCP server.
- **Destructive Bash is runtime-gated.** The agent runs under `bypassPermissions` with the Bash tool allowed, so prompt-only bans alone do not stop a prompt-injected force-push or merge. A `PreToolUse` hook (`src/core/hooks/forbidden-bash.ts`, wired in `src/core/executor.ts`) denies any Bash command matching the shared `FORBIDDEN` set (force-push, `git reset --hard`, branch delete, history rewrite, `gh pr merge`, GraphQL merge mutations) before it executes. The pattern set is shared with the static `check:no-destructive` CI guard via `src/utils/forbidden-bash.ts`, so build-time and runtime gates cannot drift. A deny emits an `agent.hook.denied` log line.

## Dispatch flow

Expand Down
8 changes: 8 additions & 0 deletions docs/operate/observability.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,14 @@ cache_read_input_tokens / (input_tokens + cache_read_input_tokens + cache_creati
| ----------------- | ----- | ------------------------------------------------------------------------------------------------------------ |
| `workspace.sweep` | info | `swept` (entries removed), `retained` (entries kept as fresh), `durationMs` (wall-clock time for the sweep). |

### Agent hook denials

The `PreToolUse` destructive-Bash hook (`src/core/hooks/forbidden-bash.ts`) emits one line each time it blocks a Bash command matching the shared `FORBIDDEN` set (force-push, `git reset --hard`, branch delete, history rewrite, `gh pr merge`, GraphQL merge mutations). A sustained `agent.hook.denied` rate means the agent is repeatedly attempting destructive operations, worth investigating as a possible prompt-injection signal. The raw command is never logged (token-leak risk); only the matched rule label is.

| `event` | Level | Fields |
| ------------------- | ----- | -------------------------------------------------------------------------------------------- |
| `agent.hook.denied` | warn | `tool` (always `Bash`), `rule` (the matched FORBIDDEN description, e.g. `git push --force`). |

## GitHub API rate-limit fields

The `App` is constructed with `ObservableOctokit` (`src/utils/octokit-observability.ts`), an `Octokit.plugin` subclass shared by `app.octokit` and every installation octokit. It logs GitHub's per-installation rate-limit headers via `octokit.hook.after` / `hook.error`. The `pipeline.stage`-style strict Zod schema (`GithubApiLogFieldsSchema`) pins the field shape.
Expand Down
19 changes: 2 additions & 17 deletions scripts/check-no-destructive-actions.ts
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

import { FORBIDDEN } from "../src/utils/forbidden-bash";

// T046b scope: the NEW ship_intents lifecycle code under
// `src/workflows/ship/`. Legacy handlers under `src/workflows/handlers/`
// embed agent prompt strings that DOCUMENT what the agent must not do
Expand Down Expand Up @@ -52,23 +54,6 @@ function scopedExecutorFiles(): string[] {
return files;
}

const FORBIDDEN: { readonly pattern: RegExp; readonly description: string }[] = [
{ pattern: /git\s+push\s+--force(?!-with-lease-if)/i, description: "git push --force" },
{ pattern: /git\s+push\s+--force-with-lease/i, description: "git push --force-with-lease" },
{ pattern: /git\s+push\s+-f\b/i, description: "git push -f" },
{ pattern: /git\s+push\s+\+/, description: "git push with + force-refspec" },
{ pattern: /git\s+push\s+--mirror/i, description: "git push --mirror" },
{ pattern: /git\s+reset\s+--hard\b/i, description: "git reset --hard" },
{ pattern: /git\s+branch\s+-D\b/i, description: "git branch -D" },
{ pattern: /git\s+push\b[^"\n]*\s--delete\b/i, description: "git push --delete" },
{ pattern: /git\s+filter-branch/i, description: "git filter-branch" },
{ pattern: /git\s+filter-repo/i, description: "git filter-repo" },
{ pattern: /git\s+replace\b/i, description: "git replace" },
{ pattern: /\bgh\s+pr\s+merge/i, description: "gh pr merge" },
{ pattern: /mergePullRequest\s*\(/, description: "mergePullRequest GraphQL mutation" },
{ pattern: /mergeBranch\s*\(/, description: "mergeBranch GraphQL mutation" },
];

function* walk(dir: string): Generator<string> {
// Fail-closed: this is a CI safety guard. Swallowing readdir/stat errors
// would let a missing or unreadable scan root report `clean` and silently
Expand Down
7 changes: 7 additions & 0 deletions src/core/executor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import { type ModelUsage, query, type SDKResultMessage } from "@anthropic-ai/cla
import { config } from "../config";
import type { BotContext, ExecutionResult, McpServerConfig, ModelUsageEntry } from "../types";
import { redactSecrets } from "../utils/sanitize";
import { createForbiddenBashHook } from "./hooks/forbidden-bash";

function isResultMessage(msg: unknown): msg is SDKResultMessage {
return (
Expand Down Expand Up @@ -266,6 +267,12 @@ export async function executeAgent({
// single fat tracking-comment dump. Block it so the model uses the eager
// tool list delivered in the SDK init message instead.
disallowedTools: ["ToolSearch"],
// Runtime backstop for the prompt-only destructive-action bans (issue #222):
// the agent runs bypassPermissions with Bash allowed, so this PreToolUse
// hook denies force-push / reset --hard / gh pr merge / GraphQL merge calls
// before they execute. Patterns are shared with the static check:no-destructive
// CI guard via src/utils/forbidden-bash.ts.
hooks: { PreToolUse: [{ matcher: "Bash", hooks: [createForbiddenBashHook(log)] }] },
// cwd is the cloned PR head (checkout.ts), which can carry an attacker's
// .claude/settings.json. Omitting settingSources makes the SDK load
// user/project/local filesystem settings by default (v0.3.x), so a
Expand Down
Loading
Loading