[codex] Prevent recurring deployment stalls#45

Draft
mberman84 wants to merge 1 commit into main from codex/fix-deploy-slowdowns-20260621

@mberman84

Contributor

What changed

  • reserve each exact CI/deployment-workflow dispatch with an atomic, per-release Git ref so concurrent followers cannot launch duplicate production runs
  • require every configured deployment workflow before a release is considered verified, while recovering unclaimed workflows after partial dispatch
  • skip integration source heads already contained in a cumulative branch and recover GitHub 204 no-op merge responses safely
  • reduce status API work for irrelevant drafts without hiding durable deploy intent, queue ownership, or actionable review threads
  • direct agents to the configured policy checkout instead of initializing accidental task-worktree policies
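As a rough illustration of the reservation idea in the first two bullets, here is a minimal sketch, not the code in this PR. It assumes the ref store offers a create-only write that fails when the ref already exists (GitHub's create-reference endpoint and `git update-ref` with an empty old value both behave this way, as far as I know). `InMemoryRefStore`, `reservation_ref`, `try_reserve`, `claim_workflows`, and the `refs/deploy-reservations/...` naming scheme are all hypothetical names chosen for the example.

```python
import threading


class RefAlreadyExists(Exception):
    """Raised by a ref store when a create-only write finds the ref present."""


class InMemoryRefStore:
    """Stand-in for a Git host (assumption); create() is create-only and atomic."""

    def __init__(self):
        self._refs = {}
        self._lock = threading.Lock()

    def create(self, ref, sha):
        # The lock makes check-and-set a single atomic step, like a
        # server-side create-only ref write.
        with self._lock:
            if ref in self._refs:
                raise RefAlreadyExists(ref)
            self._refs[ref] = sha


def reservation_ref(release_sha, workflow):
    # Hypothetical naming scheme: one ref per (release, workflow) pair.
    safe = workflow.replace("/", "-").replace(".", "-")
    return f"refs/deploy-reservations/{release_sha}/{safe}"


def try_reserve(store, release_sha, workflow):
    """Return True if this caller won the reservation and should dispatch."""
    try:
        store.create(reservation_ref(release_sha, workflow), release_sha)
    except RefAlreadyExists:
        return False
    return True


def claim_workflows(store, release_sha, workflows):
    """Return the configured workflows this caller managed to reserve.

    Workflows already claimed by another follower are skipped, so a caller
    arriving after a partial dispatch picks up only the unclaimed ones.
    """
    return [w for w in workflows if try_reserve(store, release_sha, w)]
```

With this shape, a follower dispatches only the workflows returned by `claim_workflows`; a second follower racing on the same release gets an empty list and launches nothing.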

Why

Today's Astro deploy history exposed duplicate deployment dispatches, recursive/no-op cumulative integrations, API-quota-heavy status reads, and repeated setup delays when task worktrees lacked their policy file. The slowest queue sample reached 11,189 seconds (about 3.1 hours), while normal request-to-live releases took roughly 7-10 minutes.

Validation

  • 226 tests and 202 subtests passed
  • Ruff passed
  • sdist and wheel build passed
  • structured autoreview clean
  • live status benchmark improved from 18.95s to 15.46s (about 18% faster) on the same queue snapshot

This includes and supersedes the contained-source/no-op recovery from #40. Receipt routing remains isolated in #42.
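The contained-source check can be pictured as an ancestry test: a source head needs no integration if it is already reachable from the cumulative branch head. The sketch below is an illustration under assumptions, not the PR's implementation. It models the commit graph as a plain dict from commit to parent commits, and `is_contained` and `heads_to_integrate` are hypothetical names. In a real checkout, `git merge-base --is-ancestor <source> <branch>` answers the same question through its exit status.

```python
from collections import deque


def is_contained(parents, source_head, branch_head):
    """Return True if source_head is reachable from branch_head.

    `parents` maps each commit id to a list of its parent commit ids
    (an assumed in-memory model of the commit graph).
    """
    seen = set()
    queue = deque([branch_head])
    while queue:
        commit = queue.popleft()
        if commit == source_head:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, ()))
    return False


def heads_to_integrate(parents, source_heads, branch_head):
    """Keep only the source heads the cumulative branch does not contain yet."""
    return [
        head
        for head in source_heads
        if not is_contained(parents, head, branch_head)
    ]
```

For example, if the cumulative branch head `m2` is a merge of `m1` and `f1`, then `f1` is skipped and only a head that is not yet merged, such as `f2`, is left to integrate.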
