14 changes: 14 additions & 0 deletions README.md
@@ -233,6 +233,20 @@ deploybot thread acknowledge --provider codex --thread-id "$CODEX_THREAD_ID" \
--notification-id "$DEPLOYBOT_NOTIFICATION_ID"
```

When `main` advances during a genuine repair, the next promotion pass records a
new `repair-required` event for the new base SHA even when the PR head and failure
text are unchanged. Every affected source owner can refresh in parallel; FIFO is
still enforced when repaired heads re-enter the merge queue.
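
The refresh rule above can be read as a deduplication check that includes the base SHA. The following is a minimal sketch under stated assumptions, not DeployBot's implementation: `needs_fresh_handoff` is a hypothetical helper, and the payload keys simply mirror the `head_sha`, `reason`, and `base_sha` fields named in this change.

```python
from __future__ import annotations

from typing import Any


def needs_fresh_handoff(previous: dict[str, Any] | None, current: dict[str, Any]) -> bool:
    """Return True when a new repair-required event should be recorded.

    Hypothetical helper: a prior event is reused only when the head, the
    failure text, and the base SHA all match the current state.
    """
    if previous is None:
        return True
    keys = ("head_sha", "reason", "base_sha")
    return any(previous.get(key) != current.get(key) for key in keys)


recorded = {
    "head_sha": "abc123",
    "reason": "pull request conflicts with main",
    "base_sha": "base-1",
}
# Same head and failure text, but main advanced: a fresh handoff is needed.
after_main_moved = dict(recorded, base_sha="base-2")
```

Keeping the base SHA in the comparison is what lets an unchanged PR head still produce a new event once `main` moves.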

Integration-conflict repair packets include the complete frozen pull-request and
head map. The elected owner must prove every frozen head is present before
resuming the cumulative integration pull request.
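
One way an elected owner might check the frozen map before resuming is sketched below. This is an illustration only: `missing_frozen_heads` is a hypothetical name, and both arguments are assumed to be maps from pull-request number (as a string) to head SHA, matching the shape of the frozen head map described above.

```python
from __future__ import annotations


def missing_frozen_heads(
    frozen_heads: dict[str, str],
    present_heads: dict[str, str],
) -> list[str]:
    """List frozen pull-request numbers whose exact head is absent or different.

    Hypothetical helper: an empty result means every frozen head is present.
    """
    return sorted(
        number
        for number, head_sha in frozen_heads.items()
        if present_heads.get(number) != head_sha
    )
```

An owner following this sketch would resume the cumulative integration pull request only when the returned list is empty.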

Token-authored integration PR `pull_request` runs are never accepted as exact CI
evidence. This includes GitHub's `action_required` zero-job placeholder:
DeployBot ignores it and dispatches the configured exact-branch
`workflow_dispatch` run itself. Failures in that owned run still fail closed.
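
The evidence rule above might be sketched as follows. This is an assumption-laden illustration, not DeployBot's code: `exact_ci_verdict` is a hypothetical function, each run is assumed to be a dict with `event` and `conclusion` keys, and the list is assumed to hold only runs for the integration branch, oldest first.

```python
from __future__ import annotations

from typing import Any


def exact_ci_verdict(runs: list[dict[str, Any]]) -> str:
    """Classify integration-branch runs, ignoring every pull_request run.

    Hypothetical helper returning one of:
    "dispatch" (no owned run yet), "pending", "passed", or "failed".
    """
    # pull_request runs, including action_required placeholders, are skipped.
    owned = [run for run in runs if run.get("event") == "workflow_dispatch"]
    if not owned:
        return "dispatch"
    conclusion = owned[-1].get("conclusion")
    if conclusion is None:
        return "pending"
    # Fail closed: anything other than an explicit success counts as failure.
    return "passed" if conclusion == "success" else "failed"
```

In this sketch a placeholder `pull_request` run never yields a verdict by itself; it only leaves the result at `"dispatch"` until an owned run exists.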

DeployBot does not treat a registry comment as user notification. If native
delivery fails, an independent outbox entry stays visible under pending
`notifications`, even if the PR-opening thread starts new work, and the same
10 changes: 9 additions & 1 deletion adapters/claude-code/skills/deploybot/SKILL.md
@@ -70,6 +70,11 @@ normal repaired-PR return path because it verifies, unblocks, requeues, and
wakes atomically. Never merge an unlabeled pull request or treat a wake-up event
as trusted queue state.

Provider agents must never merge through a provider UI/API or push directly to
the base branch. Missing branch protection is not permission. A user's exact
`deploy` instruction authorizes the DeployBot request and designated coordinator,
not a side-door merge by the source agent.

## Coordinate Merges

Only the designated coordinator may call `promote_deployment_requests`,
@@ -87,7 +92,10 @@ cumulative base heads until CI, deployment, and configured health checks verify.

Genuine repair blocks may hold overlapping ready work for the configured bounded
repair window, but they remain merge-ineligible until the trusted source agent
-resumes the freshly reviewed exact head.
resumes the freshly reviewed exact head. If `main` advances while a repair is
open, DeployBot records and emits a fresh repair handoff for the new base to every
affected source thread; begin those repairs in parallel while preserving FIFO for
the eventual merge.

Use `diagnose`/`deploybot doctor` for setup drift and `delivery_metrics` for p50,
p95, and slow-stage evidence. A failed cumulative CI or deployment pauses the
10 changes: 9 additions & 1 deletion adapters/codex/agent-merge-queue/skills/deploybot/SKILL.md
@@ -65,6 +65,11 @@ repaired-PR return path because it verifies, unblocks, requeues, and wakes
atomically. Never merge an unlabeled pull request or treat a wake-up event as
trusted queue state.

Provider agents must never merge through a provider UI/API or push directly to
the base branch. Missing branch protection is not permission. A user's exact
`deploy` instruction authorizes the DeployBot request and designated coordinator,
not a side-door merge by the source agent.

## Coordinate Merges

Only the designated coordinator may run `deploybot promote`, `deploybot react`,
@@ -82,7 +87,10 @@ verify.

Genuine repair blocks may hold overlapping ready work for the configured bounded
repair window, but they remain merge-ineligible until the trusted source agent
-resumes the freshly reviewed exact head.
resumes the freshly reviewed exact head. If `main` advances while a repair is
open, DeployBot records and emits a fresh repair handoff for the new base to every
affected source thread; begin those repairs in parallel while preserving FIFO for
the eventual merge.

Use `deploybot doctor --json` for setup drift and `deploybot metrics --json` for
p50, p95, and slow-stage evidence. A failed cumulative CI or deployment pauses
8 changes: 8 additions & 0 deletions adapters/cursor/AGENTS.md
@@ -16,6 +16,14 @@ freeze the queue just to inspect it.
Only the user's exact `deploy` instruction authorizes `request_deployment` for
the current thread. DeployBot uses the recorded PR-opening Cursor thread; a
coordinator must never substitute its own thread ID. Never record prompt contents.
Never merge through Cursor, GitHub's merge API, or a direct push to the base
branch. This remains forbidden when branch protection is unavailable and when
the user says `merge`, `ship`, `fix it`, or `do it`; only the exact `deploy`
instruction authorizes a DeployBot request, and only DeployBot's designated
coordinator may perform the eventual merge. Updating a feature branch with the
base branch is allowed, but making that feature head reachable from the base
branch is itself a merge and is forbidden outside DeployBot.
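
The reachability rule above can be checked with Git's ancestry test. The sketch below is an illustration, not part of DeployBot: `head_reachable_from_base` is a hypothetical helper, and the injectable `run` parameter exists only so the check can be exercised without a repository. It relies on `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not.

```python
from __future__ import annotations

import subprocess
from typing import Any, Callable


def head_reachable_from_base(
    head_sha: str,
    base_ref: str,
    run: Callable[..., Any] = subprocess.run,
) -> bool:
    """Return True when head_sha is already reachable from base_ref.

    Hypothetical helper: a True result means the feature head has, in effect,
    been merged into the base branch.
    """
    result = run(
        ["git", "merge-base", "--is-ancestor", head_sha, base_ref],
        check=False,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit status signals a Git error, such as an unknown commit.
    raise RuntimeError(f"git merge-base failed with status {result.returncode}")
```

Updating a feature branch with the base branch leaves this check False for the feature head, which is why that direction remains allowed.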

Never poll, merge an unlabeled PR, or absorb unrelated work. Let the event worker
promote fresh exact heads, use one integration PR for overlaps or cumulative
validation, return repair packets to the source thread, atomically resume after
3 changes: 3 additions & 0 deletions docs/reference.md
@@ -170,6 +170,9 @@ Provider fields are:
| `ci_failure_grace_seconds` | Non-negative window for an exact-main CI retry to replace a failed attempt before the release fails. Default: 90. |
| `promotion_workers` | Positive maximum number of deploy requests promoted concurrently. Default: 4. |
| `repair_hold_minutes` | Positive maximum time that a genuine repair may hold overlapping ready work without becoming merge-eligible. Default: 60. |
| repair handoff refresh | When `main` changes during a conflict repair, DeployBot emits a new `repair-required` handoff with the new base SHA for each affected source owner while preserving the original bounded hold start. |
| integration repair packet | Includes `source_pull_requests` and the complete `source_heads` map so the elected owner can verify every frozen source before resuming the cumulative PR. |
| suppressed integration PR run | Integration `pull_request` runs, including `action_required` zero-job placeholders, are not exact CI evidence. DeployBot uses its own exact-branch `workflow_dispatch` run, whose real failures still fail closed. |
| `hold_merges_while_releasing` | Default `true`; after a merge, admit no newer batch until the release reaches the `release_admission` gate. |
| `release_admission` | How far an in-flight release must progress before the next batch is admitted; allowed: `verified` (default, safest) waits for the cumulative exact-main revision to be live, `ci-passed` reopens admission once exact-main CI is green while deploy and health checks keep following in the background. `ci-passed` trades a larger failure blast radius for throughput, and verification and notifications for a release may be emitted by a later reaction rather than the merging one. |
| `repair_branch_prefix` | Deterministic release-repair lease branch prefix; default `"deploybot/repair"`. |
10 changes: 9 additions & 1 deletion skills/deploybot/SKILL.md
@@ -70,6 +70,11 @@ normal repaired-PR return path because it verifies, unblocks, requeues, and
wakes atomically. Never merge an unlabeled pull request or treat a wake-up event
as trusted queue state.

Provider agents must never merge through a provider UI/API or push directly to
the base branch. Missing branch protection is not permission. A user's exact
`deploy` instruction authorizes the DeployBot request and designated coordinator,
not a side-door merge by the source agent.

## Coordinate Merges

Only the designated coordinator may call `promote_deployment_requests`,
@@ -87,7 +92,10 @@ cumulative base heads until CI, deployment, and configured health checks verify.

Genuine repair blocks may hold overlapping ready work for the configured bounded
repair window, but they remain merge-ineligible until the trusted source agent
-resumes the freshly reviewed exact head.
resumes the freshly reviewed exact head. If `main` advances while a repair is
open, DeployBot records and emits a fresh repair handoff for the new base to every
affected source thread; begin those repairs in parallel while preserving FIFO for
the eventual merge.

Use `diagnose`/`deploybot doctor` for setup drift and `delivery_metrics` for p50,
p95, and slow-stage evidence. A failed cumulative CI or deployment pauses the
70 changes: 68 additions & 2 deletions src/agent_merge_queue/cli.py
@@ -2134,6 +2134,8 @@ def create_integration_pull_request(
"branch": branch,
"conflict": conflict,
"batch_id": batch_id,
"heads": heads,
"pull_requests": pull_requests,
}
finally:
if staging_created:
@@ -2887,7 +2889,14 @@ def inspect(
timestamp = parse_time(entry.queued_at)
elapsed = (now - timestamp).total_seconds() if timestamp else None
if elapsed is not None and elapsed > queue_target:
-active_gate = "; ".join(entry.reasons or []) or "merge worker"
detail = "; ".join(entry.reasons or [])
if client.config.blocked_label in entry.labels:
active_gate = (
"repair is blocked; source thread must resume"
+ (f": {detail}" if detail else "")
)
else:
active_gate = detail or "merge worker"
if not entry.reasons and entry.number in integration_numbers:
active_gate = integration_ci_active_gate(client, entry) or active_gate
alerts.append(
@@ -3353,8 +3362,12 @@ def record_repair(
intent: dict[str, Any] | None,
reason: str,
*,
base_sha: str | None = None,
resume_pull_request: int | None = None,
source_heads: dict[str, str] | None = None,
source_pull_requests: list[int] | None = None,
) -> dict[str, Any]:
current_base_sha = base_sha or client.base_sha()
comments = client.comments(entry.number)
previous = latest_payload(
comments,
@@ -3366,6 +3379,10 @@ def record_repair(
and previous.get("head_sha") == entry.head_sha
and previous.get("reason") == reason
and previous.get("intent_id") == (intent or {}).get("intent_id")
and previous.get("base_sha") == current_base_sha
and previous.get("repair_pull_request") == resume_pull_request
and previous.get("source_heads") == source_heads
and previous.get("source_pull_requests") == source_pull_requests
):
return previous
created_at = utc_now()
@@ -3381,7 +3398,7 @@ def record_repair(
or created_at
)
payload = {
-"base_sha": client.base_sha(),
"base_sha": current_base_sha,
"created_at": created_at,
"head_sha": entry.head_sha,
"hold_started_at": hold_started_at,
@@ -3398,6 +3415,10 @@ def record_repair(
}
if resume_pull_request is not None:
payload["repair_pull_request"] = resume_pull_request
if source_heads is not None:
payload["source_heads"] = source_heads
if source_pull_requests is not None:
payload["source_pull_requests"] = source_pull_requests
client.comment(entry.number, repair_body(payload))
labels = client.labels(entry.number)
if client.config.blocked_label not in labels:
@@ -3435,6 +3456,19 @@ def record_integration_conflict_repair(
return None
integration_number = int(result["number"])
conflicting_number = int(conflict["number"])
frozen_heads_value = result.get("heads")
frozen_numbers_value = result.get("pull_requests")
if not isinstance(frozen_heads_value, dict) or not isinstance(
frozen_numbers_value, list
):
raise QueueError("integration repair packet is missing frozen membership")
frozen_heads = {
str(number): str(head_sha)
for number, head_sha in frozen_heads_value.items()
}
frozen_numbers = [int(number) for number in frozen_numbers_value]
if set(frozen_heads) != {str(number) for number in frozen_numbers}:
raise QueueError("integration repair packet has inconsistent frozen members")
owner: QueueEntry | None = None
owner_intent: dict[str, Any] | None = None
for entry in entries:
@@ -3467,6 +3501,8 @@ def record_integration_conflict_repair(
owner_intent,
reason,
resume_pull_request=integration_number,
source_heads=frozen_heads,
source_pull_requests=frozen_numbers,
)
result["repair_owner"] = {
"pull_request": owner.number,
@@ -3643,6 +3679,36 @@ def evaluate(
if label != client.config.blocked_label
]
else:
reasons = entry.reasons or []
if (
"pull request conflicts with main" in reasons
and deployment_repair_required(entry)
):
current_base_sha = client.base_sha()
if not repair or repair.get("base_sha") != current_base_sha:
reason = "; ".join(reasons or ["blocked"])
repair = record_repair(
client,
entry,
intent,
reason,
base_sha=current_base_sha,
)
entry.repair_overlap_hold = repair_overlap_hold_active(
client,
entry,
intent,
repair,
)
return (
"blocked",
{
"number": number,
"reason": reason,
"repair": repair,
},
entry,
)
return (
"waiting",
{